Preface


Many years ago, I was sitting in my second-grade classroom when I 
made what I thought was a remarkable discovery: there is no largest 
number. Whatever number I thought of, I realized I could just add one 
to it and get a larger number. This remains to this day one of my most 
vivid childhood memories. What I had “discovered” was the “dot, dot, 
dot” in the infinite collection of the numbers 

1 , 2 , 3 , 4 , 5 , ... , 

which we know as the natural numbers. 

As simple as this collection of numbers may appear, humans have 
been studying these numbers for thousands of years, learning their 
properties, uncovering their secrets, finding one marvelous thing after 
another about them, and still we have only barely begun to tap this 
remarkable and ever-flowing current of ideas. These are the numbers we 
intend to study. 

This book is an introduction to the study of the natural numbers; it 
evolved from courses I have taught at Colorado College, ranging from 
a general math course designed for nonmajors to a far more rigorous 
sophomore-level course required of all math majors. I hope to preserve 
several fundamental features of these courses in this book: 

• Number theory is beautiful. It is fun. That’s why people have 
done it for thousands of years and why people still do it today. 
Number theory is so naturally appealing that it provides a perfect 
introduction — either for math majors or for nonmajors — to the 
idea of doing mathematics for its own sake and for the pleasure we 
derive from it. 

• Although number theory will always remain a part of pure mathe- 
matics (as opposed to applied mathematics), it has also in modern 
times become a spectacular instance of what the physicist Eugene 
Wigner called the “unreasonable effectiveness of mathematics” in 
that there are now important real-world applications of number 
theory. One of the most useful of these applications came along 
several centuries after the original concepts in number theory were 
developed and will be explored in the chapter on cryptography. 

• Number theory is a subject with an extraordinarily long and rich 
history. Studying number theory with due attention to its his- 
tory reminds us that this subject has always been an intensely 
human activity. Many other mathematical subjects, calculus, for
example, would have undoubtedly evolved much as they are today 
quite independent of the individual people involved in the actual 
development, but number theory has had a wonderfully quirky 
evolution that depended heavily upon the particular interests of 
the people who developed the subject over the years. 

• Reading mathematics is very different from, say, reading a novel. 
It requires enormous patience to read mathematics. You cannot 
expect to digest new, and often complex, mathematical ideas in a 
single reading. It is frequently the case that multiple readings are 
needed. You will discover that individual sentences, paragraphs, 
and even whole chapters must be read carefully several times before 
the key ideas all fall into place. 

• One of the primary goals of the book is to use the study of number 
theory as a context within which we learn to prove things. Proof 
plays a vital role in mathematics and is the way we bridge the gap 
between what our intuition tells us might be true and the certainty 
about what is true. You will encounter several quite different styles 
of proof as you read (and should feel free to skip any that you find 
either too difficult or simply not very interesting). In many cases, 
an informal argument or even a carefully examined example is 
sufficient to discover truth, but in other cases a far more rigorous 
and formal argument will be required to achieve certainty. 

Another feature of our courses at Colorado College I hope to preserve 
in this book is the interactive nature of our classes. Learning mathe- 
matics requires active participation, and this book should be read with 
paper and pencil in hand, and a good calculator or computer nearby, 
checking details and working things through as you go. Sometimes, in 
order to understand an idea, it is best to go through a few examples by 
hand. Other times it is better to let a computer do the computations, 
and so an introduction to the computer software Sage has been provided
at the back of the book. Sage is an extremely powerful aid to such
computations and is a wonderful resource that can be used online or 
downloaded for free. 

The problems at the end of each chapter are an important part of 
the text and you should try to do as many as you can. In this book 
problems are not merely exercises for you to do, but they also introduce 
definitions, explore new ideas, and prove additional results. Much of 
this material will be used later in the book, and so you should be sure 
to read all of the problems, even ones you make no attempt to solve. 
Problems that are either particularly important or explicitly referred to 
later in the book have the symbol + by them. 

Solutions and answers are provided for many of these problems at
the back of the book. There is also a separate section containing hints 
for you to consult to get an idea of how to start on a problem if you 
are stuck. Problems for which a hint is available have a letter H after 
them, and problems for which there is a solution or answer have a letter 
S after them. These solutions and hints can be used in a variety of ways: 
to check your answers; to compare your solutions with mine (there are 
often several ways to approach a given problem); to study how to write 
up a solution once you have figured out how to solve a problem; but, 
also, just to read as part of the text, since I occasionally make additional, 
and hopefully useful and interesting, comments about the material in 
these solutions. 

Also at the back of the book are two useful tables. One is simply 
a short list of prime numbers. The other is a pronunciation guide to 
help you with the names of foreign mathematicians. Rather than using 
a phonetic alphabet, these pronunciations are given in a form that 
should make it easy for any speaker of standard (American) English 
to get reasonably close to an accurate pronunciation. So, for example, 
(THA bit) is used for the ninth-century Arab mathematician Thabit ibn 
Qurra, rather than the phonetically correct (θa:bit).

It is probably obvious that covering all of the material in this book 
in a typical number theory course is not possible. I tend to think of 
Chapters 1-10 forming the core material and the topics covered in 
Chapters 11-15 being optional, perhaps to be done by students either 
individually or in groups as independent study. In the table of contents 
I have marked individual sections that I consider critical with a ★.

Many people have at various stages helped me write this book. The 
first and foremost was the long-time chair of our math department, 
Dave Roeder. It was Dave who put a number theory course at the 
very core of our math curriculum, and over the years it became my 
very favorite course to teach. I also owe a deep debt to my colleague 
Stefan Erickson who, unlike me, is a real number theorist and has used 
numerous drafts of this book in his own course on number theory. 
Stefan guided me with enormous patience through draft after draft. He 
also provided me with extraordinarily detailed student feedback from 
these courses. One result of this extensive “field-testing” is that there 
have been many students whose comments greatly improved this book. 
In particular, two of these students deserve special mention. Gautam 
Webb’s careful reading of the latest draft uncovered more errors than 
I would have believed possible. Marina Gresham did the same sort of 
meticulous reading of several early drafts; more importantly, I relied 
almost exclusively on Marina’s excellent judgment in deciding which 
problems needed to be provided with hints and solutions. 

Finally, I would like to say that while this book is modeled upon
specific courses I have taught at Colorado College, this book is nonethe- 
less intended for a far more general audience; and so there is almost 
nothing in terms of prerequisites that readers need to bring along with
them except enthusiasm and curiosity. That is one of the fundamental 
charms of number theory. It really does begin with 

1 , 2 , 3 , 4 , 5 , ... , 

and you can’t be too young, or too old, to enjoy this amazing story. 


John J. Watkins
Colorado Springs, Colorado 




Number Theory Begins 


Pierre de Fermat 

Any good story can be told in a variety of ways. Often it is simply best 
to let events unfold in strict chronological order. But one must then 
sometimes pause to backtrack in the story every so often in order to 
explain one or two things that may not be clear to the audience. One 
can even tell a tale by beginning at the very end and spin the entire 
story out in a series of flashbacks, slowly and tantalizingly revealing 
everything one layer at a time. 

I have chosen to tell the story of number theory by beginning with 
the first person who really thought about numbers in much the same 
way as we do today, and for this reason he is the first mathematician 
who could accurately be described as a number theorist. The man’s 
name is Pierre de Fermat. The year is 1659 and Fermat has just written 
to his friend Christiaan Huygens bragging about having discovered a 
“most singular method” for proving mathematical propositions and 
mentioning as an example one of his most important early results: 

There is no right-angled triangle in numbers whose area is a square.

Let us begin our story of numbers here then, three hundred and 
fifty years ago with Fermat’s proof of this proposition. His proof is 
actually quite short, but we will spend a great deal of time in this chapter 
developing his proof because I want to use this proposition as a way to 
introduce you informally to several basic ideas and topics in the theory 
of numbers. And so I intend to present the proposition in a series of 
flashbacks so that when we get to the actual proof you already know 
everything that Fermat knew when he discovered this proof. So try to 
keep in mind as you read this chapter that our ultimate goal is the proof 
of Fermat’s proposition that no right-angled triangle has square area. 


Pythagorean Triangles 

The first thing to understand about this proposition is that Fermat is 
considering only whole numbers, what we now call integers — that is, an 
integer is a number in the set Z = {..., -3, -2, -1, 0, 1, 2, 3, ...} (by
the way, the letter Z is used for this set because the German word for
numbers is Zahlen).

Figure 1.1 Pierre de Fermat, 1601-65.

So when Fermat says “triangle in numbers” and “whose area is 
square” he means that the three sides of the triangle are all to have 
integer length and that the area is also an integer that is itself the 
square of another integer. In other words, all of the numbers in Fermat’s 
proposition come from the set of natural numbers 

N = {1, 2, 3, . . . }. 

These days we call such right triangles Pythagorean triangles— a reference 
to the well-known Pythagorean theorem of high school geometry— and 
if three natural numbers a, b, c are such that $a^2 + b^2 = c^2$, then we call
this set of numbers a Pythagorean triple. Furthermore, a Pythagorean
triple {a, b, c} is said to be primitive if the three numbers have no
common positive factor other than 1.

Why are we interested in primitive Pythagorean triples? Because 
primitive Pythagorean triples represent fundamentally different triangles.
For example, the two triples {3, 4, 5} and {6, 8, 10} correspond
to two triangles that have exactly the same shape, so the only way 
these two triangles really differ is in terms of their size; one triangle 
is simply an expanded version of the other triangle. When trying to 
prove propositions such as the one of Fermat’s about right triangles 
not having square area, it is often much easier not to consider all
right triangles, but just to consider those corresponding to primitive 
Pythagorean triples, such as {3, 4, 5}, {5, 12, 13}, and {8, 15, 17}. 
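
The two definitions above can be tested mechanically. The following is a minimal Python sketch, not taken from the text; the function names is_pythagorean and is_primitive are illustrative choices.

```python
from math import gcd

def is_pythagorean(a, b, c):
    """Return True if a, b, c are natural numbers with a^2 + b^2 == c^2."""
    return a > 0 and b > 0 and c > 0 and a * a + b * b == c * c

def is_primitive(a, b, c):
    """Return True if a, b, c have no common positive factor other than 1."""
    return gcd(gcd(a, b), c) == 1

# Print each triple with its Pythagorean and primitive status.
for triple in [(3, 4, 5), (6, 8, 10), (5, 12, 13), (8, 15, 17)]:
    print(triple, is_pythagorean(*triple), is_primitive(*triple))
```

Only {6, 8, 10} fails the primitivity test here, since all three of its numbers are divisible by 2.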


Babylonian Mathematics 

This brings us to a serious flashback because Pythagorean triangles have 
been known about for a very long time, even long before the time 
of Pythagoras, who himself lived in about the sixth century B.C. For 
instance, we happen to know that they were an important part of early 
Babylonian mathematics because the records of that Mesopotamian 
empire were kept by scribes on clay tablets written in a style known as 
cuneiform script (because of the distinctive wedge-shaped marks made 
in these clay tablets using a stylus). Many of these tablets survived to 
this day because of the dry climate of that region— Babylon, the capital 
of this empire, was located on the Euphrates about sixty miles south 
of present-day Baghdad. One of these ancient tablets somehow made 
its way into a private collection in Florida before finally becoming a 
permanent part of the Plimpton collection at Columbia University, 
where it was given the catalog number 322. This particular tablet is now 
quite famous and is called, simply, Plimpton 322. 

Plimpton 322 has been described by mathematician and science
historian Otto Neugebauer as “one of the most remarkable documents 
of Old-Babylonian mathematics.” It contains fifteen rows and four 
columns (although there is some damage), the fourth column be- 
ing just a numbering of the rows 1 through 15. Until Neugebauer 
deciphered this tablet it had been considered merely a “commercial 
account.” What Neugebauer managed to figure out is that instead 
this tablet effectively contains a list of fifteen Pythagorean triples: the 
middle two columns are the hypotenuse and the shortest side of right 
triangles. For each of these fifteen triangles, if we call the hypotenuse c 
and the shortest side a, and then compute $\sqrt{c^2 - a^2}$, we get an integer;
in other words, they knew the Pythagorean theorem in this part of the 
world twelve hundred years before Pythagoras! 

The rows on the tablet begin in the first row with a right triangle 
{119, 120, 169} that is nearly isosceles— that is, the two legs are almost 
equal— and the triangles gradually change shape as you move down the 
tablet until you end at the bottom row with a right triangle {56, 90, 106} 
whose legs are not at all equal. It is worth noting that the largest triangle 
on the list is the fourth one {12 709, 13 500, 18 541} and remembering 
that this triangle was computed thirty-five hundred years before calculators.
The first column turns out to contain the numbers $c^2/(c^2 - a^2)$,
so this column records the square of the ratio of the hypotenuse to the 
“third” side— in modern terminology this would be represented as 
the square of the cosecant of the angle between the hypotenuse 
and the shortest side— and this ratio gradually diminishes as you go 
down the tablet. 

At this point we could happily end this flashback into Babylonian 
mathematics satisfied that we have seen ample evidence of an aware- 
ness of the Pythagorean theorem from such a distant time in the hu- 
man past, but Neugebauer discovered something even more interesting 
about Plimpton 322. He discovered why this tablet contains the fifteen 
Pythagorean triples that it does. To understand this, and thus continue 
our flashback, we need to talk about the way in which the Babylonians 
represented numbers. 


Sexagesimal Numbers 

In the 1997 film Contact, based on a novel of the same name by Carl 
Sagan, Jodie Foster plays the role of a brilliant astronomer who is the 
first human to receive a message from extraterrestrial beings. She has 
been monitoring an array of radio telescopes in New Mexico and knows 
the signal she is receiving from a distant star can only be coming 
from intelligent life because the signal is repeating a sequence of prime 
numbers over and over again. What better way can there be to shout
into the universe “We are here” than to send a message about numbers 
that can be universally understood by anyone who hears it? 

Number theory is the study of inherent properties of numbers. For 
example, whether a number is odd or even is an inherent property 
of a number; it doesn’t depend upon how the number is represented. 
The number 17 is odd whether we represent it, as we just did, in the 
familiar decimal system, or as XVII, in the Roman numeral system, or 
as 10001, in the binary system. Similarly, the fact that 17 is a prime 
number doesn’t depend on how it is represented. That’s why any curi- 
ous, intelligent life form anywhere in the universe is eventually going to 
discover prime numbers. The fictional beings from that distant planet 
revolving around the star Vega who sent us a message in that movie 
had discovered prime numbers. And here on earth, thirty-five hundred 
years ago, the Babylonians had also discovered prime numbers, and had 
been fascinated by them. 

The Babylonians used a sexagesimal number system much like our 
own decimal system, but based instead upon the number 60. They got 
the idea from the Sumerians and, in fact, we still use a version of their 
system today for some parts of our lives. We measure time in units of 
sixty: 60 minutes in an hour, and 60 seconds in a minute. We measure 
angles and navigate using degrees: 360 degrees in a circle, 60 minutes
of arc in a degree, 60 seconds of arc in a minute of arc. The reason for
choosing 60 as a base for a number system, and the reason we still use 
it for some purposes today, is that 60 has so many different factors. In 
particular, then, an hour, or a circle, can conveniently be broken up into 
1, 2, 3, 4, 5, 6, or even more parts. 

So, how does a sexagesimal system work? Well, when we write a 
number such as 3456 in our decimal system, what we mean is that 

$3456 = 3 \times 10^3 + 4 \times 10^2 + 5 \times 10^1 + 6 \times 10^0$.

How do we write this same number in the Babylonian system? We need 
to write it in terms of powers of 60, that is, as something like 

$a \times 60^2 + b \times 60^1 + c \times 60^0$,

where we just have to figure out what a, b, and c need to be. But $60^2 = 3600$
is larger than 3456, so it turns out we don’t need the $a \times 60^2$ term at all.
So divide 3456 by 60 and get 57.6, which means that we should let
$b = 57$, and then $c = 3456 - 57 \times 60 = 36$. Therefore, in the sexagesimal
system we would express 3456 as $57 \times 60^1 + 36 \times 60^0$. We write this more
conveniently as 57, 36 meaning 57 “sixties” plus 36 “ones.”
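
The conversion just described amounts to repeated division by 60, collecting remainders. Here is a minimal Python sketch of that idea; the function name to_sexagesimal is an illustrative assumption, not something from the text.

```python
def to_sexagesimal(n):
    """Return the base-60 digits of a non-negative integer, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, 60)  # peel off the lowest base-60 digit
        digits.append(remainder)
    return digits[::-1]

print(to_sexagesimal(3456))  # [57, 36]
```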

Let’s do one in the other direction. What number does the sexagesimal
number 4, 37, 46, 40 represent? It represents

$4 \times 60^3 + 37 \times 60^2 + 46 \times 60^1 + 40$,

which is equal to $4 \times 216\,000 + 37 \times 3600 + 46 \times 60 + 40 = 1\,000\,000$.
Note that we are using the international system of marking off the 
digits in large numbers by groups of three using spaces rather than 
commas. 
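
Going in this direction is just a matter of multiplying by 60 and adding the next digit at each step. A minimal Python sketch follows; from_sexagesimal is a hypothetical helper name used only for illustration.

```python
def from_sexagesimal(digits):
    """Return the integer whose base-60 digits are given, most significant first."""
    value = 0
    for d in digits:
        value = value * 60 + d  # shift one sexagesimal place, then add the digit
    return value

print(from_sexagesimal([4, 37, 46, 40]))  # 1000000
```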

Now, of course, in the decimal system we have ten different symbols 
for our ten digits, but as you can imagine the Babylonians didn't have 
sixty different symbols to use. Instead, for the number 57 they would 
just make seven vertical marks next to five marks shaped like < that 
each represented 10. You might barely be able to make out these two 
kinds of marks in Figure 1.2. Another way in which the Babylonians’ 
system differed from our current decimal system is that they didn’t have 
a symbol for zero, so they just left a blank space. This meant that two 
very different numbers such as 7232 and 432 032 would look the same 
since their sexagesimal representations are, respectively, 2, 0, 32 and 
2, 0, 0, 32, and so, with only a space between 2 and 32 in each case, 
there would be no way to tell what power of 60 to use for the 2. This 
would have to be inferred from the context, which was not usually at all 
difficult in practice. 


Regular Numbers 

The Babylonian system differed from ours in still another way: it did not 
use a decimal point, or, rather, we should say a sexagesimal point. For 
us the numbers 3456 and 3.456 are very different. The latter number, of 
course, means 


$3 + 4 \times 10^{-1} + 5 \times 10^{-2} + 6 \times 10^{-3}$.

The Babylonian system on the other hand was more flexible— and, 
again, context would be used to resolve any ambiguity. For example, 
5, 30 could represent 330, that is, $5 \times 60 + 30$; but it could also represent
$5 + 30 \times \frac{1}{60}$, that is, $5\frac{1}{2}$.

As another example, we could represent the fraction $\frac{1}{3456}$ in the
sexagesimal system as 1, 2, 30 because

$\frac{1}{3456} = \frac{1}{60^2} + \frac{2}{60^3} + \frac{30}{60^4}$.

(Check this if you want.) This is really rather remarkable. A fraction such
as $\frac{1}{3456}$, which in the decimal system becomes 0.000 289 351 85 ...
and goes on forever, has in the sexagesimal system a finite
representation.

There is a simple reason this happens, and it has to do with the factors 
of 60. Since $60 = 2^2 \cdot 3 \cdot 5$, the only prime factors of 60 are 2, 3, and 5. The
Babylonians discovered that for any number that has no prime factors
other than 2, 3, or 5 (such as $3456 = 2^7 \cdot 3^3$, for example) the reciprocal
of the number has a finite representation. But if you take any other 
number and try to write its reciprocal as a sexagesimal fraction, this 
fraction will go on forever. So, numbers that have no prime factors other 
than 2, 3, and 5 were very important to the Babylonians. Neugebauer 
called these numbers regular numbers. 
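
One way to see this behavior is to compute the sexagesimal places of a reciprocal one at a time using exact fractions. The following Python sketch does this under stated assumptions: the name reciprocal_digits and the cutoff max_places are illustrative, and reaching the cutoff only suggests, rather than proves, that an expansion never ends.

```python
from fractions import Fraction

def reciprocal_digits(n, max_places=12):
    """Sexagesimal places of 1/n after the sexagesimal point, for an integer n > 1.

    Returns (digits, terminated), where terminated is True if the
    expansion ended within max_places places.
    """
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    x = Fraction(1, n)
    digits = []
    while x != 0 and len(digits) < max_places:
        x *= 60
        d = x.numerator // x.denominator  # integer part is the next digit
        digits.append(d)
        x -= d
    return digits, x == 0

print(reciprocal_digits(3456))  # ([0, 1, 2, 30], True)
print(reciprocal_digits(7))     # digits repeat 8, 34, 17 and never terminate
```

The leading 0 for 3456 is the empty 1/60 place, so the remaining digits 1, 2, 30 sit in the 1/60^2, 1/60^3, and 1/60^4 places.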

Now, let’s look again at Plimpton 322 and the fifteen triangles it 
contains. Here is a list of all fifteen of the “third” sides of these triangles, 
which we get by computing $\sqrt{c^2 - a^2}$: 120, 3456, 4800, 13500, 72, 360,
2700, 960, 600, 6480, 60, 2400, 240, 2700, 90. Notice that the number 
3456 is on that list. These are not just any old integers, but they all share 
with 3456 the special property that was very important in Babylonian 
mathematics: the only prime factors of any of these numbers are 2, 3, 
and 5 — that is, they are all regular numbers! 
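
The claim is easy to check by dividing out the primes 2, 3, and 5 from each number on the list. A minimal Python sketch, with regular_exponents as a hypothetical helper name:

```python
def regular_exponents(n):
    """Return (a, b, c) with n == 2**a * 3**b * 5**c, or None if n is not regular."""
    if n < 1:
        return None
    exponents = []
    for p in (2, 3, 5):
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        exponents.append(e)
    # Anything left over is a prime factor other than 2, 3, or 5.
    return tuple(exponents) if n == 1 else None

third_sides = [120, 3456, 4800, 13500, 72, 360, 2700, 960,
               600, 6480, 60, 2400, 240, 2700, 90]
for b in third_sides:
    print(b, regular_exponents(b))
```

For 3456 the result is (7, 3, 0), matching the factorization into seven 2s and three 3s.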


Square Numbers 

I hope that during this flashback to Babylonian mathematics you 
haven’t forgotten about Fermat’s proposition that no Pythagorean tri- 
angle has a square area. As it happens, one of the earliest translations 
that was ever done of a Babylonian clay tablet was of a tablet that is 
nothing more than a table that lists the numbers from 49 to 60 and 
their squares. The property of a number being a square was something 
that was very important to the Babylonians. 

The square numbers are the numbers 1, 4, 9, 16, 25, . . . and, as you 
can infer from their appearance on a Babylonian clay tablet, these 
particular numbers have fascinated people since ancient times. When 
you saw this list of square numbers just presented to you, you undoubt- 
edly thought to yourself something like: of course I recognize 9 is a 
“square” number because $9 = 3^2$, and 16 is a “square” number because
$16 = 4^2$. But twenty-five hundred years ago a young Greek student of
mathematics in the city-state of Ionia would have thought something 
more like: of course 9 and 16 are “square” numbers because piles of nine 
stones and sixteen stones can each be arranged into square arrays 
of stones on the ground. One of you thinks of the concept “square
number” algebraically, and the other thinks of the same concept 
geometrically. 

This will become a recurring theme as we continue our study of
number theory. Just as we did with square numbers we will assign 
various traits to numbers and speak of there being prime numbers, reg- 
ular numbers, perfect numbers, triangular numbers, Fibonacci numbers, 
Mersenne numbers— the list goes on and on, each term describing num- 
bers that have a particular property that we find interesting. For each 
of these categories of numbers you will want to try to get a feeling for 
what makes that kind of number special. For square numbers we have 
lost in modern times that “feel” of the geometric quality that makes 
them special. Fermat undoubtedly had a much fuller appreciation of 
both the algebraic and geometric nature of square numbers than we do 
today. One of the great mathematicians of modern times, Paul Erdős,
was legendary for the “feel” he had for numbers. At an international
conference in Boca Raton, Florida, in 1994, Erdős expressed this great
affection he had for numbers during one of his famous annual addresses 
to the conference in a typically humorous way by telling the audience 
that he suspected he was, at the age of eighty-one, “probably a square 
for the very last time.” 

As for Fermat’s proposition about triangles, the reason we are able to 
restrict our attention to primitive Pythagorean triples is because of the 
intimate way area is linked to squaring. Suppose we have two similar 
triangles such as a {3, 4, 5} triangle and a {6, 8, 10} triangle. The larger
triangle has sides that are twice as long as those in the smaller triangle,
but its area is four times that of the smaller triangle. Area = $\frac{1}{2}$ (base $\times$
height), so the small triangle has area $\frac{1}{2} \cdot 3 \cdot 4 = 6$, and the larger triangle
has area $\frac{1}{2} \cdot 6 \cdot 8 = 24$.

Similarly, the {9, 12, 15} triangle has sides that are three times as long, 
but area that is nine times that of the smaller {3, 4, 5} triangle, since its 
area is $\frac{1}{2} \cdot 9 \cdot 12 = 54$. In this same way, any Pythagorean triangle that is
similar to the primitive triangle {3, 4, 5} will have an area that is a square 
multiple of the area of this smaller triangle. Thus, since the {3, 4, 5} 
triangle does not have square area, no Pythagorean triangle similar to it 
can have square area. (Of course, if the {3, 4, 5} triangle did have square 
area, this same argument would mean that any Pythagorean triangle 
similar to it would also have square area.) 

Note that this argument relies entirely on a basic fact about numbers 
that we will have to say more about later, namely, if you multiply a 
square number by a square number you get another square number, 
but if you multiply a non-square number by a square number you get 
a non-square number. 
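
The scaling argument above can be written out as a short worked equation. Here a and b stand for the two legs of a right triangle and k for a positive integer scale factor; these symbols are introduced only for this illustration.

```latex
\[
\tfrac{1}{2}(ka)(kb) = k^2 \cdot \tfrac{1}{2}ab
\]
\[
k = 2:\quad \tfrac{1}{2} \cdot 6 \cdot 8 = 24 = 2^2 \cdot 6,
\qquad
k = 3:\quad \tfrac{1}{2} \cdot 9 \cdot 12 = 54 = 3^2 \cdot 6
\]
```

So every triangle similar to {3, 4, 5} has area 6 times a square, and whether that area is itself a square depends only on whether 6 is.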


Primitive Pythagorean Triples

So, to prove his proposition, Fermat needed to know exactly which 
triples of numbers form primitive Pythagorean triples. Not only did 
Fermat know this, but this is something that had been known for a 
long time. Neugebauer even thinks the Babylonians knew this! And 
that brings us once again to Plimpton 322. 

It turns out that the scribes made four errors on this tablet, and the 
nature of these errors makes it clear that the first column was not being 
computed directly from the middle two columns. In other words, they 
must have had some other method for producing the numbers in these 
three columns. Neugebauer believes that behind the scenes for each row 
of Plimpton 322 lay two small regular numbers s and t. For example, for 
the first row there would be the two regular numbers s = 12 and t = 5. 
Then, the numbers for the middle two columns would be computed by 
finding 


$c = s^2 + t^2$; $a = s^2 - t^2$.

For the first row, you would get $c = 12^2 + 5^2 = 169$, and $a = 12^2 - 5^2 = 119$.
The number in the first column would be computed by finding
$((s^2 + t^2)/(2st))^2$, which in this case is $(169/120)^2$, and this is exactly
the number 1, 59, 0, 15 written sexagesimally that Neugebauer found
on the tablet (or, at least, he found the 15; the rest had been obliterated).
The “third” side of this triangle, call it b, would be computed as $b = 2st$.
Here, you would get $b = 120$.
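
The computation for a single row can be reproduced with exact fractions. The sketch below is only an illustration of the arithmetic; row_from_generators and sexagesimal_places are hypothetical helper names, and nothing here is a claim about how the scribes actually worked.

```python
from fractions import Fraction

def row_from_generators(s, t):
    """Return (a, b, c, first_column) for generating numbers s > t."""
    c = s * s + t * t
    a = s * s - t * t
    b = 2 * s * t
    first_column = Fraction(c, b) ** 2  # square of hypotenuse over "third" side
    return a, b, c, first_column

def sexagesimal_places(x, places):
    """Return the whole part of a non-negative Fraction and its first sexagesimal places."""
    whole = x.numerator // x.denominator
    frac = x - whole
    digits = []
    for _ in range(places):
        frac *= 60
        d = frac.numerator // frac.denominator
        digits.append(d)
        frac -= d
    return whole, digits

a, b, c, column = row_from_generators(12, 5)
print(a, b, c)                        # 119 120 169
print(sexagesimal_places(column, 3))  # (1, [59, 0, 15])
```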

When viewed in this light, Plimpton 322 starts to make a lot more 
sense. Take, for example, the largest triangle on the list, represented by 
the triple {12 709, 13 500, 18 541}. At first glance this triangle seems quite
out of place. But for this triple, the values of the two regular numbers s
and t are $s = 125$ and $t = 54$, which again seems somewhat arbitrary
until we look at their prime decomposition: $125 = 5^3$ and $54 = 2 \cdot 3^3$.
So, in fact, these are very simple numbers built from the fundamental
building blocks in the Babylonian system: 2, 3, and 5. This same pattern
holds for each row on the tablet: one row would have $s = 2^5$ and
$t = 3 \cdot 5$; another, $s = 2 \cdot 5^2$ and $t = 3^3$. There is one exception: the row
for the triangle {45, 60, 75} is the only row where the numbers have a
common factor. This triangle of course is similar to the triangle {3, 4, 5}
(for which we can use $s = 2$ and $t = 1$) but is much more in the same
scale as the other triangles on the list. 

Row after row of the tablet, the values of s and t satisfy four proper- 
ties: (i) s > t; (ii) one of s or t is even, and the other is odd; (iii) they 
are both regular numbers, that is, their prime decompositions use only 
the three primes 2, 3, and 5; and, finally, (iv) s and t are relatively prime, 
that is, they have no common factor other than 1. These values of s and
t never appear on the tablet of course; they are merely somewhere in 
the background. Perhaps the Babylonians were aware of these numbers 
and used them in their calculations, perhaps not. Perhaps they just used 
a simpler formula such as $b^2 = (c + a)(c - a)$ (not that they would
have been able to express it in this modern algebraic form). We just don’t
know.

Whether or not the Babylonians actually produced Plimpton 322 in 
the highly number theoretic way I have been suggesting, we are now 
ready to characterize primitive Pythagorean triples. This theorem, our 
first, was certainly known to Euclid, who gave a proof for this marvelous 
construction in the late fourth century B.C. 


Theorem 1.1. For any primitive Pythagorean triple {x, y, z}, where
$x^2 + y^2 = z^2$, one of the numbers x or y must be even, and the other odd,
so let x be the even number; then, there exist two positive integers s and t,
s > t, one even and the other odd, with s and t having no common factor
other than 1, such that


$x = 2st$; $y = s^2 - t^2$; $z = s^2 + t^2$.

Moreover, if s and t are any two such positive integers, then these formulas 
produce a primitive Pythagorean triple. 
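
The second half of the theorem gives a recipe for listing primitive Pythagorean triples. A minimal Python sketch of that recipe follows; the function name primitive_triples and the bound max_s are assumptions made for the illustration.

```python
from math import gcd

def primitive_triples(max_s):
    """List (x, y, z) = (2st, s^2 - t^2, s^2 + t^2) for all valid s <= max_s."""
    triples = []
    for s in range(2, max_s + 1):
        for t in range(1, s):
            # Conditions from the theorem: one of s, t even and the other odd,
            # and no common factor other than 1.
            if (s - t) % 2 == 1 and gcd(s, t) == 1:
                triples.append((2 * s * t, s * s - t * t, s * s + t * t))
    return triples

for x, y, z in primitive_triples(4):
    print(x, y, z)
```

With max_s = 4 this prints the triples (4, 3, 5), (12, 5, 13), (8, 15, 17), and (24, 7, 25).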


Proof 

First, we note that, for a primitive Pythagorean triple {x, y, z}, x and
y cannot both be even, since then z would also be even and all three
integers would have 2 as a factor. In order to show similarly that x and y
cannot both be odd we will give an argument — much as Fermat would 
have done — based on an idea we will use often in this book having to 
do with the notion of remainders. 

We know that a number can be either even or odd. We express this 
by saying that an even number can always be written in the form 2k, 
where k is also an integer; and that an odd number can always be written 
in the form 2k + 1, where k is again an integer. So, for example, $26 = 2 \cdot 13$
and $57 = 2 \cdot 28 + 1$. Another way of saying this is that if we divide an
integer by 2, there are only two possible remainders: 0 and 1. 

Now, we want to show that x and y cannot both be odd, so let’s see
what would happen if they both were odd. That is, let us suppose that x
and y are both odd. That means we can write $x = 2k + 1$ and $y = 2j + 1$.
(We have to use a different letter j for y because we don’t want to assume
that x and y are equal.) Now we can compute

$x^2 + y^2 = (2k + 1)^2 + (2j + 1)^2 = 4k^2 + 4k + 1 + 4j^2 + 4j + 1$
$= 4(k^2 + k + j^2 + j) + 2$.

We conclude that a 2 + y 2 is a number of the form 4 i + 2 (where in this 
case i is just the number k 2 + k + j 2 + /). Another way of saying this is 
that if we were to divide a 2 + y 2 by 4, we would get a remainder of 2. For 
example, since 70 = 4 • 17 + 2, we say that 70 is of the form 4 i + 2, or 
that if we divide 70 by 4 then the remainder is 2. 

But recall that $x^2 + y^2 = z^2$, so $z^2$ must also be of the form $4i + 2$,
and have a remainder 2 when divided by 4. However, as we shall see,
this is impossible, because a square can never be of the form $4i + 2$ and
have remainder 2 when divided by 4. Why not? Well, if z is an even
number, then z can be written as 2k, and then $z^2 = (2k)^2 = 4(k^2)$, and
$z^2$ has a remainder 0 when divided by 4; on the other hand, if z is an
odd number, then z can be written as 2k + 1, and then $z^2 = (2k + 1)^2 =
4k^2 + 4k + 1 = 4(k^2 + k) + 1$, and $z^2$ has a remainder 1 when divided
by 4. That is, squares have remainder 0 or 1 when divided by 4.
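
The remainder argument is easy to test on a range of numbers. The following Python sketch is only a finite check, not a proof, and the variable names are ours. It collects the remainders that squares leave on division by 4, and the remainders left by a sum of two odd squares.

```python
# Remainders of k^2 on division by 4, for k = 0, 1, ..., 99.
square_remainders = sorted({(k * k) % 4 for k in range(100)})

# Remainders of x^2 + y^2 on division by 4 when x and y are both odd.
odd_sum_remainders = sorted({(x * x + y * y) % 4
                             for x in range(1, 100, 2)
                             for y in range(1, 100, 2)})

print(square_remainders)   # remainders a square can leave
print(odd_sum_remainders)  # remainders a sum of two odd squares can leave
```

The two lists have no remainder in common, which is exactly the contradiction the proof relies on.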

So, we found out exactly what happens if we assume that both x and y
are odd. We end up with the conclusion that $z^2$ has a remainder 2 when
divided by 4, but we know this is impossible. Therefore, our assumption
had to be wrong, and we conclude that one of the numbers x or y must
be even. And since we already decided they can’t both be even, the other
number is odd. We arbitrarily decide to let x be the even number and y
the odd number. Note, then, that z will always be odd.

Since y and z are both odd, the numbers z + y and z - y are even, so
we can write them as $z - y = 2u$ and $z + y = 2v$. (Note that it follows that
$z = u + v$ and $y = v - u$.) Thus $x^2 = z^2 - y^2 = (z - y)(z + y) = 4uv$, and so
$(\frac{x}{2})^2 = uv$. So uv is a square (note that $\frac{x}{2}$ is an integer since x is even). We
claim that u and v are both squares.

To verify this claim we need to understand what makes a number a 
square in terms of its prime decomposition: each prime needs to occur 
an even number of times. So, $2^2 \cdot 5^6 \cdot 11^4$ will be a square, but $3^8 \cdot 7^3$ won’t
be a square. This means a product such as uv can be a square as long as 
each prime collectively shows up an even number of times in the prime 
decompositions of the two numbers u and v. But what if u and v don’t 
have any primes in common? Then the only way uv can be a square is if 
u is a square and v is a square. That’s the situation in our proof and what 
we need to show, namely, that u and v are relatively prime. But if u and 
v have a common positive factor d, then d is also a common factor for y 
and z, and hence a factor of x. Since {x, y, z} is a primitive Pythagorean
triple, the only way this can happen is if d = 1. We conclude that u and
v are both squares. 

Therefore, we can write $u = t^2$ and $v = s^2$, and get

$x^2 = 4uv = 4s^2t^2$; $y = v - u = s^2 - t^2$; $z = u + v = s^2 + t^2$,

exactly as desired. Note that since u and v are relatively prime, so are s
and t; in particular, s and t cannot both be even. They cannot both be odd
either, since then $y = s^2 - t^2$ would be even, so one of s and t is even,
and the other odd.
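
The first half of the proof is constructive, so it can be turned into a small procedure. The Python sketch below follows the steps above: form u and v from z - y = 2u and z + y = 2v, then take square roots. The function name `s_and_t` and the error message are our own, and the sketch assumes the triple is given with x as the even leg.

```python
from math import isqrt

def s_and_t(x, y, z):
    """Recover (s, t) from a primitive triple {x, y, z} with x even."""
    u, v = (z - y) // 2, (z + y) // 2   # z - y = 2u and z + y = 2v
    s, t = isqrt(v), isqrt(u)
    # The proof shows that u and v are squares when the triple is primitive.
    if s * s != v or t * t != u:
        raise ValueError("u and v are not both squares")
    return s, t

print(s_and_t(12, 5, 13))
print(s_and_t(40, 9, 41))
```

For a triple that is not primitive, such as {8, 6, 10}, the values u = 2 and v = 8 are not squares and the sketch raises an error, which matches the role primitivity plays in the proof.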

Finally, we still need to verify the converse, namely, that if s and 
t are any two positive integers with the given properties, then these 
formulas will produce a primitive Pythagorean triple. That the formulas 
produce a Pythagorean triple is straightforward algebra since $x^2 + y^2 =
(2st)^2 + (s^2 - t^2)^2 = (s^2 + t^2)^2 = z^2$. If the triple is not primitive, then
there is a prime number p that is a factor for all three numbers x, y,
and z. Since one of s and t is even and one is odd and $z = s^2 + t^2$,
we know that z is odd. So, in particular, we know that p can’t be 2.
Now, p is a factor of both y and z, so p is also a factor of their sum
$y + z = (s^2 - t^2) + (s^2 + t^2) = 2s^2$. Therefore, since p is a prime number
other than 2, p must be a factor of s. In the same way p also is a factor of
the difference $z - y = (s^2 + t^2) - (s^2 - t^2) = 2t^2$, and so p is a factor of t as
well. But this is a contradiction since s and t are supposed to have been 
relatively prime integers, so no prime could be a factor of both. Thus the 
triple must have been primitive after all. This completes the proof of the 
theorem. ■ 

The area of the triangle given by a primitive Pythagorean triple
{x, y, z} is $\frac{1}{2}xy$, and so the area of this triangle must be an integer
because, by Theorem 1.1, one of x or y is even. Thus, the area of any
Pythagorean triangle is an integer.


Infinite Descent 

We are ready to go back to 1659 to see how Fermat proved his proposition.
Fermat gave the barest outline of a proof of his proposition
in his letter to Huygens saying only “if the area of such a triangle 
were a square, then there would also be a smaller one with the same 
property, and so on, which is impossible.” Nevertheless, this single brief 
statement does capture beautifully the essence of the “most singular 
method” of proof of which Fermat was so justifiably proud, and which 
he called the infinite descent. 

Fermat’s method of infinite descent is based on a very simple idea. 
In order to prove that no Pythagorean triangle can have square area, 
you assume that there is one that does have square area and show how
to produce a smaller Pythagorean triangle that also has square area. If 
you can do that, you would be done, because you could then repeat 
the same process on the smaller triangle and get a still smaller triangle 
with square area, and again to get a still smaller triangle, and so on, 
forever. Why is this impossible? Because of the positive integers 1, 2, 
3, . . . . If $A_1$ is the area of the first triangle, and $A_2$ is the area of the
second triangle, and so on, then we have produced an infinite sequence
of strictly decreasing positive integers


$A_1 > A_2 > A_3 > A_4 > \cdots$,


which is an impossibility within the natural numbers. It is that simple. 

What Fermat left out of his letter because it “would make his discourse
too long” was any discussion at all about how to take one
Pythagorean triangle with square area and produce from it a smaller 
Pythagorean triangle that also has square area. In other words, he left 
out his proof! He did, however, at least write down his proof in the margin
of a book, and a very famous book at that: his copy of Bachet’s translation
of Diophantus, a book we discuss at some length in Chapter 4. Here is 
that proof. 


Theorem 1.2. No Pythagorean triangle has square area.


Proof 

This proof will use Fermat’s method of infinite descent. Our strategy, 
therefore, will be to assume there is a Pythagorean triangle with square 
area, and then produce a smaller Pythagorean triangle with square area. 
Using infinite descent, that is all we have to do in order to prove the 
theorem. It is also worth recalling from our discussion in the section on 
square numbers that any Pythagorean triangle with square area will be 
similar to a primitive Pythagorean triangle that has square area, so we 
can focus on primitive Pythagorean triangles in this proof. 

We start the proof, using the notation of Theorem 1.1, by assuming
that $\{2st,\ s^2 - t^2,\ s^2 + t^2\}$ is a primitive Pythagorean triple that represents
a triangle whose area is square. The area of this triangle is $A = \frac{1}{2}xy$, that
is, $A = st(s + t)(s - t)$. Since A is a square and s and t are relatively prime,
all four terms in this expression for A are also relatively prime to one
another; therefore, all four terms are themselves squares, and we can
write $s = a^2$, $t = b^2$, $s + t = u^2$, $s - t = v^2$. Note that u and v must both
be odd, and are relatively prime.

Now, let’s look at the three squares $v^2$, $a^2$, and $u^2$. We can compute

$a^2 - v^2 = s - (s - t) = t = b^2$ and $u^2 - a^2 = (s + t) - s = t = b^2$,

and so we see, first of all, that $v^2 < a^2 < u^2$.

Moreover, we see that the difference between $v^2$ and $a^2$ is $b^2$, and that
the difference between $a^2$ and $u^2$ is also $b^2$. In other words, $a^2$ is in the
exact middle between $v^2$ and $u^2$. (For example, here are three squares
49 < 169 < 289 where 169 is in the exact middle.) We conclude that
$u^2 - v^2 = 2b^2$, and we factor this to get $2b^2 = (u + v)(u - v)$.

Next, we observe that since u and v are odd, u + v and u - v will each be
even. But one of them will be exactly divisible by 4, and the other one
won’t. (This is because these two numbers differ by 2v and v is an odd
number; that is, they differ by $2v = 2(2k + 1) = 4k + 2$.) So, whichever
one of them is exactly divisible by 4, we write that one as $4n^2$, and we
write the other one as $2m^2$. Then $u = \frac{1}{2}((u + v) + (u - v)) = m^2 + 2n^2$,
and $v = \frac{1}{2}((u + v) - (u - v)) = \pm(m^2 - 2n^2)$, where the plus or minus depends
on which number was exactly divisible by 4, and $2b^2 = (2m^2)(4n^2)$, so
$b = 2nm$.

But we now have a smaller Pythagorean triangle. Taking $m^2$ and $2n^2$
as the two “legs” for this smaller triangle we compute

$(m^2)^2 + (2n^2)^2 = m^4 + 4n^4 = \frac{1}{2}((m^2 + 2n^2)^2 + (m^2 - 2n^2)^2)$

$= \frac{1}{2}(u^2 + (\pm v)^2) = \frac{1}{2}(u^2 + v^2) = a^2$,

and so $m^2$ and $2n^2$ form the legs of a new Pythagorean triangle whose
hypotenuse is a, whereas the hypotenuse of the original triangle was
$s^2 + t^2 = a^4 + b^4$. So this new triangle is definitely smaller. Also, the area
of this new triangle is $m^2n^2$, so it has square area, namely, $(nm)^2$.

Thus we have accomplished what we set out to do: we produced 
a smaller Pythagorean triangle with square area. Hence, by infinite 
descent, we are done. This completes the proof of the theorem. ■ 
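
Theorem 1.2 makes a claim about all Pythagorean triangles, so no computer search can replace the proof. Still, a finite search is a reassuring sanity check. The Python sketch below (the function name and the bound of 200 are arbitrary choices of ours) looks for Pythagorean triangles with both legs at most a given bound whose area is a perfect square.

```python
from math import isqrt

def square_area_triangles(max_leg):
    """Pythagorean triangles with legs <= max_leg whose area is a square."""
    found = []
    for x in range(1, max_leg + 1):
        for y in range(x, max_leg + 1):
            z = isqrt(x * x + y * y)
            if z * z != x * x + y * y:
                continue              # not a Pythagorean triangle
            area = x * y // 2         # an integer, since one leg is even
            if isqrt(area) ** 2 == area:
                found.append((x, y, z))
    return found

print(square_area_triangles(200))
```

As the theorem predicts, the search comes back empty.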


Arithmetic Progressions 

The situation that occurred in the proof of Theorem 1.2, where there 
were three squares $v^2 < a^2 < u^2$ with $a^2$ in the exact middle, is worth
another look. The example we gave there was 49 < 169 < 289 where 
the square 169 is in the exact middle. In this case the common difference 
between 49 and 169, and between 169 and 289, is 120. 

This is a specific case of what we call an arithmetic progression. An
arithmetic progression is just a sequence of numbers (and the sequence
can be either finite or infinite) where any successive pair of
numbers in the sequence has a constant difference, and we call this
constant difference the common difference. So, the infinite sequence
7, 12, 17, 22, 27, . . . , is an arithmetic progression where the common
difference is 5. Or, the finite sequence 49, 169, 289, 409, 529 is an 
arithmetic progression where the common difference is 120. 

There is an interesting connection between Pythagorean triangles 
and squares in an arithmetic progression. Suppose three squares, $a^2$, $b^2$,
and $c^2$, are in an arithmetic progression. This means that $b^2 - a^2 =
c^2 - b^2$. Let $x = \frac{c + a}{2}$ and $y = \frac{c - a}{2}$; then, since $b^2$ is the middle term in
the progression, we get

$b^2 = \frac{a^2 + c^2}{2} = \frac{(c + a)^2 + (c - a)^2}{4} = x^2 + y^2$.

(In this chain of three equalities, the first equality holds because $b^2$,
being the middle term in an arithmetic progression, is the average
of the two terms on either side; the second equality can be verified
easily by expanding $(c + a)^2 + (c - a)^2$; and the last equality follows
immediately from the definitions of x and y.) Thus we see that we have a
Pythagorean triangle {x, y, b}, where the hypotenuse b comes from the
middle square.

Moreover, the common difference, d, of this arithmetic progression
is given by

$d = \frac{c^2 - a^2}{2} = \frac{(c + a)(c - a)}{2} = 2xy$.


(Again, in this chain of three equalities, the first equality holds because 
in the arithmetic progression a 2 , b 2 , c 2 the two terms c 2 and a 2 differ by 
2d, that is, by two of the common difference d; the second equality is 
obvious; and the last equality once again follows immediately from the 
definitions of x and y.) 

But the area of this triangle is $\frac{1}{2}xy$, so we also see the remarkable
fact that the common difference of the original arithmetic progression 
is four times the area of the Pythagorean triangle. What is even more 
remarkable is that this connection between Pythagorean triangles and 
three squares in an arithmetic progression has been known for more 
than a thousand years. 

Let’s look at the example we just mentioned during the proof of
Theorem 1.2: 49 < 169 < 289. In this case, a = 7, b = 13, c = 17,
so we get $x = \frac{17 + 7}{2} = 12$ and $y = \frac{17 - 7}{2} = 5$. The corresponding triangle is
{12, 5, 13}, which is a Pythagorean triangle, and the area of this triangle
is $\frac{1}{2}(12)(5) = 30$. The common difference of the three squares, 120, is in
fact four times this area, 30.
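
The passage from three squares in arithmetic progression to a Pythagorean triangle is mechanical enough to write down as a short Python sketch. The function name `triangle_from_squares` is ours; the formulas are the ones in the text, with x equal to half of c + a and y equal to half of c - a.

```python
def triangle_from_squares(a, b, c):
    """Given a^2, b^2, c^2 in arithmetic progression, build the triangle."""
    d = b * b - a * a
    if c * c - b * b != d:
        raise ValueError("a^2, b^2, c^2 are not in arithmetic progression")
    # a and c have the same parity (c^2 - a^2 = 2d is even), so these halve exactly.
    x, y = (c + a) // 2, (c - a) // 2
    area = x * y // 2
    return (x, y, b), area, d

triangle, area, d = triangle_from_squares(7, 13, 17)
print(triangle, area, d)
```

For the squares 49 < 169 < 289 this reproduces the triangle {12, 5, 13}, its area 30, and the common difference 120, which is four times the area.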

We’ll do another example of this remarkable connection by looking 
at a very old problem concerning the area of Pythagorean triangles 
from an eleventh-century Byzantine manuscript found in a library in 
Istanbul (formerly Constantinople): find a Pythagorean triangle of area 
$5m^2$. Note that this is similar to the proposition Fermat addressed in
Theorem 1.2, except that the area is not a square, but a multiple of a
square. The writer of this problem, writing almost a thousand years
ago, begins his solution of this particular problem very casually by
saying “we must take for $m^2$ a multiple of 6”; he then lets m = 6, which
is certainly an easy way to make sure that $m^2$ is a multiple of 6.

Did you know that the area of a Pythagorean triangle is always divisible
by 6? Apparently this was common knowledge in Constantinople a
thousand years ago. Nevertheless, we had better check this fact. We’ll
do this by showing that the area is divisible by 2, and also that it is
divisible by 3; hence it is divisible by 6. (Note that this line of reasoning
works only because 2 and 3 are relatively prime. Just because a number
is divisible by 10 and 15 doesn't mean it is divisible by 150; for example,
30 is divisible by both 10 and 15 but not by 150.)

So, let {x, y, z} be a Pythagorean triangle as in Theorem 1.1 (we can
restrict our attention to primitive Pythagorean triangles because if the
area of a triangle represented by any primitive Pythagorean triple is
divisible by 6, then the area of any Pythagorean triangle will also be
divisible by 6). Then the area of this triangle is given by $A = \frac{1}{2}xy$. But
recall that x = 2st where one of s and t is even, which means that x is in
fact divisible by 4. Therefore, the area A is divisible by 2, as desired.

Next we show that A is also divisible by 3. If x is divisible by 3,
we’ll be done, and if either s or t is divisible by 3, then this is obvious,
so let’s suppose that neither s nor t is divisible by 3. This means that
when we divide either s or t by 3 we get a remainder of 1 or 2. If s
and t happen to have the same remainder, then s - t will be divisible
by 3, whereas if s and t happen to have different remainders, then
s + t will be divisible by 3 (simply because 1 + 2 = 3). So, either way,
$y = s^2 - t^2 = (s - t)(s + t)$ will be divisible by 3. In other words, if
x isn't divisible by 3, then y will be. So, in a Pythagorean triangle, not
only is one of the two legs divisible by 4, but one of the two legs is
divisible by 3. (As we said, this fact has been known for at least a
thousand years.) Hence the area is divisible by 6.
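
The divisibility claim can be spot-checked with the area formula A = st(s + t)(s - t) for the primitive triangles of Theorem 1.1. The Python sketch below is only a finite check under our own choice of bound (s up to 60), and the function name `areas` is ours.

```python
from math import gcd

def areas(max_s):
    """Areas st(s - t)(s + t) of the primitive triangles with s <= max_s."""
    return [s * t * (s - t) * (s + t)
            for s in range(2, max_s + 1)
            for t in range(1, s)
            if (s - t) % 2 == 1 and gcd(s, t) == 1]

# Collect the remainders these areas leave on division by 6.
remainders = {area % 6 for area in areas(60)}
print(remainders)
```

Every area in the range leaves remainder 0, in agreement with the argument above. The smallest case, s = 2 and t = 1, gives the triangle {4, 3, 5} with area exactly 6.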

The other thing the writer of this problem knew about was the 
connection between Pythagorean triangles and squares in arithmetic 
progression. So, since he had decided to let m = 6, he was looking for a
triangle whose area is $5 \cdot 6^2 = 180$, and therefore he knew all he had to do
was find three squares in an arithmetic progression where the common 
difference is four times that area; that is, the common difference should 
be 720. 

The rest was easy for him because if the common difference d should 
be 720, then, using our previous notation where d = 2xy, we see that
xy = 360 would work. Then he chose x = 9 and y = 40 as factors for
360, and got $9^2 + 40^2 = 41^2$. That's how he did it. So his answer for
this problem is the triangle {9, 40, 41}, which does have an area of the
desired form since $\frac{1}{2} \cdot 9 \cdot 40 = 180 = 5 \cdot 6^2$.

While we are at it, let’s check the arithmetic progression. The middle 
square should be $41^2$, and the common difference is supposed to be
720. Are $41^2 - 720$ and $41^2 + 720$ both squares as they should be? Well,
$41^2 - 720 = 961 = 31^2$ and $41^2 + 720 = 2401 = 49^2$, and so $31^2$, $41^2$, $49^2$
form an arithmetic progression.
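
The Byzantine writer's choice of x = 9 and y = 40 can be double-checked by brute force. The following Python sketch (the function name `legs_with_product` is ours, chosen only for this illustration) tries every factorization xy = 360 and keeps those for which the sum of the squares of x and y is itself a square.

```python
from math import isqrt

def legs_with_product(n):
    """Triples (x, y, z) with x <= y, x*y == n and x^2 + y^2 == z^2."""
    triples = []
    for x in range(1, isqrt(n) + 1):
        if n % x == 0:
            y = n // x
            z = isqrt(x * x + y * y)
            if z * z == x * x + y * y:
                triples.append((x, y, z))
    return triples

print(legs_with_product(360))

# The three squares centered on 41^2 with common difference 720.
progression = [41**2 - 720, 41**2, 41**2 + 720]
print(progression)
```

Among the factor pairs of 360, only 9 and 40 are the legs of a Pythagorean triangle, so the writer's choice was in fact forced.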


Fibonacci's Approach 

Fibonacci — who was born in Pisa around 1180, but grew up in North 
Africa and traveled extensively — could also solve this problem about 
finding a Pythagorean triangle of area $5m^2$ because he knew what
we now know, namely, that the common difference for the related
arithmetic progression of three squares would be given by

$d = 2xy = 4st(s^2 - t^2)$,

where we are again using the notation of Theorem 1.1. 

Hence 8 is going to divide d (because s or t is even), 3 is going to
divide d (because, as we just saw, 3 divides x or y), and 5 is also going
to divide d (because the area, $\frac{1}{2}xy$, is supposed to be $5m^2$). Thus d is a
multiple of 120. Then Fibonacci just picked two convenient small values
of s and t to make this happen, namely, s = 5 and t = 4, which yields the
same value d = 720 as before. So Fibonacci gets the exact same triangle
since $x = 2st = 2(5)(4) = 40$ and $y = s^2 - t^2 = 5^2 - 4^2 = 9$.
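
Fibonacci's formula for d lends itself to a small experiment. In the Python sketch below, `common_difference` is the formula from the text, while `three_squares` is our own helper: it inverts the earlier definitions of x and y as half the sum and half the difference of c and a, so that c = x + y, a = |x - y|, and the middle root b is the hypotenuse. Both function names are hypothetical.

```python
def common_difference(s, t):
    """d = 2xy = 4st(s^2 - t^2) for the triangle x = 2st, y = s^2 - t^2."""
    return 4 * s * t * (s * s - t * t)

def three_squares(s, t):
    """Roots (a, b, c) of three squares a^2 < b^2 < c^2 with difference d."""
    x, y, z = 2 * s * t, s * s - t * t, s * s + t * t
    return abs(x - y), z, x + y

d = common_difference(5, 4)
a, b, c = three_squares(5, 4)
print(d, (a, b, c))
```

With s = 5 and t = 4 this recovers d = 720 and the roots 31, 41, 49. Other values of s and t give other progressions; for instance s = 2 and t = 1 give the squares 1, 25, 49 with common difference 24.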

But Fibonacci noticed something else interesting about problems 
such as these. First of all, a square $n^2$ can be written as the sum of the
first n odd integers. For example, $4^2 = 1 + 3 + 5 + 7$.

Why this is true is visually obvious if you just think about 16 stones 
arranged in a square array (see Figure 1.3). Remove 1 stone from, say, the 
top right-hand corner, then remove the next 3 stones in an L-shaped 
pattern from the top right, then the next 5 again in an L-shaped pattern, 
and then the final 7 remaining stones in this same pattern. 

Figure 1.3. 1 + 3 + 5 + 7 = 16.
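
The stone-removal picture can be mirrored in a few lines of Python. This is a numerical check for the first hundred values of n, not a proof, and the function name is ours.

```python
def sum_of_first_odds(n):
    """Add up the first n odd integers 1, 3, 5, ..., 2n - 1."""
    return sum(2 * k - 1 for k in range(1, n + 1))

# Compare each sum with the corresponding square.
checks = [sum_of_first_odds(n) == n * n for n in range(1, 101)]
print(sum_of_first_odds(4), all(checks))
```

Each L-shaped layer of stones in the square array corresponds to one term `2 * k - 1` of the sum.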


Fibonacci used this basic fact in a clever way on the triangle problem. 
In this approach to the problem he concentrates on finding three 
squares in an arithmetic progression with common difference 720. He 
factors 720 into $8 \cdot 90$, and writes 720 as a sum of 8 consecutive odd
integers whose average is 90 (that is, 90 is in the exact center of this 
sequence of odd numbers): 

720 = 83 + 85 + 87 + 89 + 91 + 93 + 95 + 97. 

Then he also factors 720 into $10 \cdot 72$, and writes 720 as a sum of 10
consecutive odd integers whose average is 72 (again, 72 is in the exact 
center): 

720 = 63 + 65 + 67 + 69 + 71 + 73 + 75 + 77 + 79 + 81. 


Note that, amazingly, these two sequences match up perfectly in that 
they could be combined into a single sequence beginning at 63 and 
ending at 97. 

Now, it’s just a matter of noticing that 

$1 + 3 + 5 + \cdots + 97 = 49^2$,

$1 + 3 + 5 + \cdots + 81 = 41^2$,

$1 + 3 + 5 + \cdots + 61 = 31^2$.

Thus we know the three squares $31^2$, $41^2$, and $49^2$ are in an arithmetic
progression with common difference 720. 

What made this idea work for Fibonacci is that the first sum has 8 
consecutive odd integers centered on the number 90, and the second 
sum has 10 consecutive odd integers centered on the number 72, and 
furthermore, these consecutive sums fit together perfectly (since 83 
is the next odd integer after 81). In general then, in order for this
method of Fibonacci’s to work, we need to be able to factor the common
difference d in two ways, $d = \alpha\beta = \gamma\delta$, such that $\alpha + \gamma = \beta - \delta$.
(By the way, we are using the four Greek letters $\alpha$, $\beta$, $\gamma$, $\delta$ here simply
because they are four convenient letters to use; but any four available
letters would be just as good, such as j, k, e, f.) In the problem above,
for example, the fact that 8 + 10 = 90 - 72 meant that there were 18 
consecutive odd numbers with the first 10 centered on 72 and last 8 
centered on 90. 
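
The general condition can be searched for by machine. The Python sketch below is our own reading of Fibonacci's method, not his procedure: it tries every pair of factorizations of d into a count of odd integers times their center, requires the two factors of each factorization to have the same parity (so that the block really consists of odd integers), and keeps the pairs whose blocks fit together. The function name `fibonacci_squares` is hypothetical. The roots of the three squares are read off from the ends of the blocks.

```python
def fibonacci_squares(d):
    """Find (a, b, c) with b^2 - a^2 == c^2 - b^2 == d by Fibonacci's method."""
    # Factorizations d = count * center with count < center.
    pairs = [(f, d // f) for f in range(1, d + 1)
             if d % f == 0 and f < d // f]
    results = []
    for alpha, beta in pairs:          # upper block: alpha odd integers centered on beta
        for gamma, delta in pairs:     # lower block: gamma odd integers centered on delta
            if (alpha + beta) % 2 or (gamma + delta) % 2:
                continue               # block would not consist of odd integers
            if alpha + gamma == beta - delta:   # the two blocks are adjacent
                a = (delta - gamma) // 2
                b = (delta + gamma) // 2
                c = (beta + alpha) // 2
                results.append((a, b, c))
    return results

print(fibonacci_squares(720))
print(fibonacci_squares(120))
```

For d = 720 the only pair of factorizations that fits is 8 · 90 and 10 · 72, giving the roots 31, 41, 49. For d = 120 the search finds 4 · 30 and 6 · 20, which give the squares 49, 169, 289 from the earlier example.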

Both Fibonacci and Fermat were well aware of the close connection 
between Pythagorean triangles and squares in arithmetic progression. 
In fact, another way to think of Theorem 1.2 is: it is impossible to have 
three squares in an arithmetic progression whose common difference is a 
square. Fibonacci had made this same assertion long before Fermat, but 
gave a completely inadequate argument to support his claim. 

So we have begun our story of number theory with Fermat because 
modern number theory itself can be said to have begun with Fermat. 
Although numbers had engaged people in creative thought in many 
parts of the world for centuries and even millennia before the time of 
Fermat, he was the one who through his insight, curiosity, and vast 
correspondence set number theory on the path that we still follow 
today. We shall return to Fermat over and over again during the telling 
of our story in this book, but for now, in the next chapter, we again need 
to take a look much further back. Fermat did not invent number theory 
in a vacuum. The ultimate source of Fermat’s ideas concerning numbers 
was the ancient Greeks, and these ideas came forward to him from them 
in a single book, Arithmetica, by Diophantus. 


Problems 

1.1 The Pythagorean triples {3, 4, 5}, {5, 12, 13}, and {7, 24, 25} each 
represent right triangles in which the hypotenuse and one leg differ by 
only a single unit. Prove that there are infinitely many such 
Pythagorean triples by showing that for any odd number 2k + 1, the 
triple $\{2k + 1,\ 2k^2 + 2k,\ 2k^2 + 2k + 1\}$ is a Pythagorean triple. Pythagoras
knew of these triangles. Are they always primitive Pythagorean triples? 

1.2 (H,S) Find all solutions in the positive integers to the equation 

$x^2 + y^2 = 1003$.

1.3 (H,S) Find two primitive Pythagorean triples that represent triangles 
having different hypotenuses but equal area. (Fermat proved that for
any number n there are in fact n triangles with different hypotenuses 
and the same area.) 

1.4 (H) Prove that for any integer n > 3 there is a Pythagorean triangle with
one of its legs having length n.

For which integers n will there be a primitive Pythagorean triangle 
with n as the length of one of its legs? 

1.5 (H,S) Prove that the radius of the inscribed circle of a Pythagorean triangle
is always an integer.

1.6 (H,S) What is the longest possible hypotenuse a right triangle with integer
sides can have if the radius of the inscribed circle is 12?

This problem appeared in the 2007 American Mathematics League 
Competition. 

1.7 ★ (H,S) We say that a set of numbers is pairwise relatively prime if any two
numbers in the set are relatively prime; in other words, for every pair of
numbers from the set the only common factor of both numbers is 1. It 
is obvious that if a set of numbers is pairwise relatively prime, then the 
only common factor of all the numbers in the set is 1. 

However, the converse of this statement is not true. Find a 
counterexample by finding a set of three numbers {a, b, c} whose only
common factor is 1, and yet no pair of these numbers is relatively 
prime. 

Then determine whether a primitive Pythagorean triple is always 
pairwise relatively prime, and prove this one way or the other. 

1.8 (H,S) Neugebauer called numbers that have no prime factors other than 2,
3, and 5 regular numbers. Regular numbers are therefore the numbers
whose reciprocals can be expressed as finite sexagesimal fractions. For
example, the reciprocal of 3 is an infinite decimal fraction
($\frac{1}{3} = 0.333\ldots$) but is finite as a sexagesimal fraction
($\frac{1}{3} = 0.20, 0, 0, \ldots$).

Express the reciprocal of 75 both as a decimal fraction and as a 
sexagesimal fraction. Then express the reciprocal of 7 both as a decimal 
fraction and as a sexagesimal fraction. 

1.9 (S) Here are the values of s and t that correspond to the fifteen rows of
Plimpton 322. (No values are given for row 11 because that particular
row contains the triangle {45, 60, 75} instead of the triangle {3, 4, 5}
for which s = 2 and t = 1.)

Row    1    2    3    4    5    6    7    8    9   10   12   13   14   15
s     12   64   75  125    9   20   54   32   25   81   48   15   50    9
t      5   27   32   54    4    9   25   15   12   40   25    8   27    5


Note that a few values seem to be missing such as s = 5, t = 2, 
which give triangle {21, 20, 29}. Perhaps this triangle was left out 
because a = 21 is not the shortest side — in other words, the angle of 
43.60 degrees between a and the hypotenuse is less than 45 degrees. 

The values s = 6, t = 5 are also missing and would give triangle 
{11, 60, 61}. Maybe this triangle was left out because a is too 
short — that is, the angle 79.61 degrees is too big. The question 
remains: why does Plimpton 322 contain these fifteen triangles and no 
others? 

One reasonable hypothesis to support Neugebauer’s answer would 
be that Plimpton 322 is a list of all the triangles you would get for all 
regular s and t less than or equal to 125 (this number is chosen because 
s = 125 does occur behind the scenes in the fourth row) and assuming 
that the larger of the two acute angles is supposed to range from about 
45 degrees in the triangle at the top of the list to almost 60 degrees at 
the bottom. Do you find this a plausible explanation for Plimpton 322? 
Support your answer. 

1.10 ★ (S) Formulas such as $1 + 3 + 5 + \cdots + (2n - 1) = n^2$ (which we “proved”
geometrically in the text) can be proved algebraically using a method 
that is very much like Fermat’s method of infinite descent except that 
it works in the other direction. The method is fundamentally simple: 
you prove the formula for a small value such as n = 1, and then you
prove that whenever the formula is true for one value n it is also true for
the next value n + 1. That’s all there is to it. The formula is then true for
all values of n; it is true for n = 1, so then it must be true for the next
number n = 2, and for the next number n = 3, and the next number 
n = 4, and on, and on, forever. 

This method could appropriately be called the method of infinite 
ascent, but it was given the name the method of induction by Augustus 
De Morgan in 1838, and first used by Blaise Pascal in 1654. 

Use induction to prove that $n^2$ is the sum of the first n odd integers
by assuming that


$1 + 3 + 5 + \cdots + (2n - 1) = n^2$,

and then showing algebraically that the formula holds for n + 1:

$1 + 3 + 5 + \cdots + (2n + 1) = (n + 1)^2$.

Don’t forget to show that the formula is true for n = 1.

1.11 ★ (H,S) Let’s look at Problem 1.10 again, and make sure we see exactly how 
the key inductive step is working. By the inductive step we mean the step 
in the proof where you show that if the formula is true for one value n, 
it is also true for the next value n + 1. So, in this problem we are going 
to isolate the inductive step. 

Assume that the formula in Problem 1.10 is true for n = 49, that is, 
assume that 


$49^2 = 1 + 3 + 5 + 7 + \cdots + 93 + 95 + 97$.

Then use this assumption to prove that the formula is also true for 
n = 50, that is, prove that 

$50^2 = 1 + 3 + 5 + 7 + \cdots + 93 + 95 + 97 + 99$.

1.12 ★ (H,S) (a) Use induction to prove the following formula for the sum of
the first n squares:

$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6}$.

We will give another proof of this formula in Chapter 2. 

(b) It turns out that there is only one positive integer n (other 
than 1) such that the sum of the first n squares is itself a square. 
Use this formula to find that integer. 

You can find a very nice geometric proof of this formula in 
“Counting Squares to Sum Squares” by Duane W. DeTemple, The 
College Mathematics Journal 41(2) (May 2010), 214-19. Here the idea is 
that the left side of this formula represents the total number of squares 
that can be found inside an n × n grid.

1.13 ★ (H,S) You should think about prime numbers the same way we think
about atoms. They are the building blocks for the rest of the integers.
And it is no accident that both the concept of prime number and the 
concept of the atom come down to us from the ancient Greeks. 

Modern theories of the atom can be traced back to the fifth-century 
B.C. Greek philosophers Democritus and Leucippus, who proposed 
that all matter is made up of very small indivisible particles. 

In number theory this same notion can be traced back to the very
same time and place. The Pythagoreans divided the natural numbers 
greater than 1 into two kinds: the indivisible numbers, called prime 
numbers, and the composite numbers, which are numbers that can be 
written as a product of two smaller numbers. 

Prove that any integer greater than 1 can be written as a product of 
one or more primes. That is, prove that any integer greater than 1 has a 
prime decomposition. 

1.14 ★ (H) In the proof of Theorem 1.1, we used the fact that if we divide a
square by 4 the remainder will be either 0 or 1; that is, it will never be
2 or 3. And we express this fact by saying that a square $n^2$ must have the
form 4k or 4k + 1, and never the form 4k + 2 or 4k + 3.

Prove that if we divide a cube by 9, the remainder will be either 0, 1,
or 8; that is, it will never be 2, 3, 4, 5, 6, or 7. In other words, show that
a cube $n^3$ must have the form 9k, 9k + 1, or 9k + 8.

1.15 ★ (S) In his letter of 1659 to Huygens, in which he reported having
discovered his method of infinite descent, Fermat wrote: “At first I used
it only to prove negative assertions such as: No number of the form
3n - 1 can be written as $x^2 + 3y^2$.” This is a somewhat puzzling
statement because infinite descent is not needed at all to prove such an
easy result. Prove this result by using the idea that any integer must
have one of three forms: 3n, 3n + 1, or 3n + 2. (Note that when Fermat
talks about a number having the form 3n - 1 that is equivalent to
saying it has the form 3n + 2.)

1.16 (H,S) The fact that a square must have the form 4n or 4n + 1 immediately
implies that no number of the form 4n + 3 can be the sum of two
squares. Still, it might be possible for such a number to be written as
the sum of three squares. However, it is easy to check that the number 7
cannot be written as a sum of three squares, and that in fact 7 requires
four squares: 7 = 4 + 1 + 1 + 1. In 1638, Fermat wrote to Mersenne that
no integer of the form 8n + 7 can be written as the sum of three or
fewer squares, and that this remains true even if you use rational
squares. Mersenne passed this correspondence on to Descartes, who
was quite disdainful that Fermat had announced such a trivial result.

Give a proof of this result for integers by first proving that any square
$n^2$ must have the form 8k, 8k + 1, or 8k + 4. Then prove that the result is
also true for rational squares. A rational number is a number that can be
written as a fraction $\frac{a}{b}$ where a and b are integers, and in this case we
would call $(\frac{a}{b})^2$ a rational square.

1.17 (H,S) Show that if a, b, c are three positive integers such that

$a^3 + b^3 = c^3$,

then one of the three numbers must be divisible by 7. 

1.18 ★ (H,S) Fermat would eventually prove that a cube can never be written as
the sum of two cubes. This is one case of his famous conjecture known
as Fermat’s last theorem. However, it is possible for a cube to be the sum
of three cubes. Find two different solutions in the positive integers to
the equation

$a^3 + b^3 + c^3 = d^3$.

Such a solution {a, b, c, d} is the cubic analog of a Pythagorean triple
and is therefore called a cubic quadruple. One of these cubic quadruples 
should both surprise and delight you. 

1.19 (H,S) Fibonacci was once challenged to find three squares in arithmetic
progression with common difference 5, that is, to find three rational
numbers $a$, $b$, and $c$ such that $b^2 - a^2 = c^2 - b^2 = 5$. Solve this problem.
A rational number is a number that can be written as a fraction $r/m$ where
$r$ and $m$ are integers.

1.20 (S) Fibonacci picked the values s = 5 and t = 4 to find three squares in
arithmetic progression with common difference 720. Try several other
values of s and t such as s = 5 and t = 2, or even s = 2 and t = 1, to see
what other arithmetic progressions of three squares you come
up with.

1.21 (H,S) Fibonacci used his method to find three squares in arithmetic
progression with common differences other than 5. Use his
method, as did Fibonacci himself, for the number 7 by taking s = 16
and t = 9. That is, find three squares in arithmetic progression with 
common difference 7. In particular, explicitly use two factorizations of 
the common difference d to find the three squares by expressing each 
occurrence of the common difference between the squares as a sum of 
consecutive odd numbers. 

1.22 The Babylonians were not the only ones to do computations in base 60.
The decimal system we use now only began to become known in
Europe during the late Middle Ages. In particular, astronomers
routinely used base 60 for their calculations. In 1483, a book of
astronomical data called the Alphonsine tables was published in Spain
(named for Alfonso X of Castile, who commissioned the work).

The Alphonsine tables contain an amazingly accurate estimate of
the length of a year, one that is within a few seconds of the current
estimate. This estimate, written in base 60, was

365; 14, 33, 9, 57

for the number of days in a year. Our current calendar is based on
$365 + \frac{97}{400}$ as a fairly close estimate for the number of days in a year.

Show that the first two “digits” of the Alphonsine estimate (that is,
14, 33) are exactly equal to the fraction $\frac{97}{400}$ that we use for our calendar
today.


Euclid 


Greek Mathematics 

The source of modern number theory in terms of both its content and 
the manner in which we pursue it lies in the mathematics of the Greek- 
speaking people who lived throughout the eastern Mediterranean re- 
gion in various independent city-states for more than a thousand years 
from roughly the sixth century B.C. onward. 

One of the best known of these ancient Greek mathematicians
is the sixth-century B.C. philosopher Pythagoras, who was born on the 
island of Samos off the coast of present-day Turkey. Little is known of his 
life but he did travel widely and eventually settled in southern Italy with 
a group of followers we now call the “Pythagoreans.” Pythagoras and 
his brotherhood are today perhaps best remembered for a philosophy 
that placed number at the very center of everything. Nicomachus, 
writing in about A.D. 100 in his Introduction to Arithmetic, captures this 
Pythagorean view of the universe: 

All that has by nature with systematic method been arranged in 
the universe seems both in part and as a whole to have been 
determined and ordered in accordance with number, by the 
forethought and the mind of him that created all things; for the 
pattern was fixed, like a preliminary sketch, by the domination of 
number preexistent in the mind of the world-creating God. 

Pythagoras has been credited with discovering the simple relation- 
ship in music between harmony and number— that is, between the 
length of a string (or, as legend has it, the size of a blacksmith’s anvil) 
and the pitch produced, so that a 2:1 ratio between the lengths of 
two strings will produce an octave difference in pitch, a 3:2 ratio will 
produce a “perfect fifth” (such as a G and a D, for example), and 
so on. 

The Pythagoreans have also been credited with many discoveries in 
mathematics for which there is little evidence. The Pythagorean theo- 
rem is one famous example. Another is an often told story that it was 
the Pythagoreans who first discovered that $\sqrt{2}$ is an irrational number,
which was extremely upsetting to them. A particularly dramatic version
of this story, and one that is reminiscent of a famous scene in The
Godfather Part II, has a group of irate Pythagoreans row the unfortunate
fellow who made this discovery out to the middle of a lake and drown
him in order to keep the discovery secret.

Figure 2.1 Triangular arrays.


Triangular Numbers 

The Pythagoreans did attach special properties to numbers. The num- 
ber 2 represented man, 3 represented woman, and so 5 was marriage; 
4 was justice, 1 was reason. Rather strangely, even numbers were con- 
sidered feminine and odd numbers masculine. 

More important for us, the Pythagoreans saw geometric properties
in numbers. So, in addition to square numbers that corresponded to
square arrays of stones placed on the ground (after all, they were
aware of the formula $n^2 = 1 + 3 + 5 + \cdots + (2n - 1)$ mentioned in
the last chapter), they also studied triangular numbers, that is,
numbers of stones that can be placed in triangular arrays
on the ground. The triangular numbers, then, are 1, 3, 6, 10, 15, ... ,
as shown in the triangular arrays in Figure 2.1.

The nth triangular number $t_n$ is by definition the sum of the first n
natural numbers, that is,

$t_n = 1 + 2 + 3 + \cdots + n$.

Furthermore, since we observe in Figure 2.1 that you can get from one
triangular number to the next by adding a single row of stones, it is
obvious that

$t_n = t_{n-1} + n$.

For example, $t_5 = t_4 + 5$; that is, 15 = 10 + 5.

Another simple and extremely important formula for triangular
numbers was known to the Pythagoreans. You can take two copies of
the triangular array for a given triangular number $t_n$ and place them
together to form an $n \times (n + 1)$ rectangle. Let’s do this for the triangular
number $t_5 = 15$ from Figure 2.1, and we see that we get the $5 \times 6$
rectangle shown in Figure 2.2, which of course has 30 stones. In general,
the two triangles will form a single $n \times (n + 1)$ rectangle with $n(n + 1)$
stones. Therefore,

$t_n = \frac{n(n + 1)}{2}$.

The great early nineteenth-century mathematician Carl Friedrich 
Gauss will be mentioned frequently in this book, and is even the main 
topic in Chapter 8, but one story about Gauss is worth telling now 
because it has to do with triangular numbers. Here is how the story 
goes. When Gauss was a young boy, about eight or so, his teacher 
one day gave his class the following problem in order to keep them 
occupied for a while: add all the numbers from 1 to 100. Much to the
teacher’s surprise, Gauss did this in just a moment or two, presumably
by noticing that 1 + 100 = 101, 2 + 99 = 101, 3 + 98 = 101, and so
on, and in this way he got fifty identical sums, each being 101, so the
total sum is 50(101) = 5050; that is, $1 + 2 + \cdots + 100 = \frac{100(101)}{2}$. In
other words, as a boy, Gauss discovered the formula we gave above for 
triangular numbers! To be honest, there does seem to be some doubt 
as to whether this event actually took place. Nonetheless, this story of 
Gauss’s early childhood genius, much like the famous but apocryphal 
story of the apple that fell on Isaac Newton’s head, is now firmly rooted 
in mathematical folklore. By the way, in Problem 2.1 you will be asked 
to give a third proof of this extremely important formula for triangular 
numbers. 
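As a quick numerical check of the formula for triangular numbers, here is a minimal Python sketch. The function names are our own choices for illustration and do not come from the text; the code simply compares the direct sum 1 + 2 + ... + n with n(n + 1)/2.

```python
def triangular_by_sum(n):
    """Add 1 + 2 + ... + n directly (the definition of the nth triangular number)."""
    return sum(range(1, n + 1))


def triangular_by_formula(n):
    """Use the closed formula n(n + 1)/2; the product is always even, so // is exact."""
    return n * (n + 1) // 2


# The two computations agree for every n we try.
for n in range(1, 201):
    assert triangular_by_sum(n) == triangular_by_formula(n)

print([triangular_by_formula(n) for n in range(1, 6)])  # the first five triangular numbers
print(triangular_by_formula(100))                       # the sum in the Gauss story
```

A check like this is evidence, not a proof: it confirms the formula only for the values tried, which is exactly why the rectangle argument above matters.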

Another very nice fact about triangular numbers that was undoubt- 
edly known to the Pythagoreans, but is usually attributed to Nico- 
machus, is that if you add any two consecutive triangular numbers, you 
get a square number. So, for example, 6 + 10 = 16, and 15 + 21 = 36. 

In other words,

$t_{n-1} + t_n = n^2$.

Figure 2.3 $t_4 + t_5 = 5^2$.


This fact can be made visually obvious simply by placing two triangular 
arrays together to form a square, as we see illustrated in Figure 2.3. You 
will be asked for an alternative proof in Problem 2.2. 

A much less obvious fact about triangular numbers that was also 
known to the Pythagoreans, though attributed to Plutarch in about 
A.D. 100, is that eight times a triangular number plus one always yields 
a square. So, for example, $8 \cdot 6 + 1 = 49$, and $8 \cdot 15 + 1 = 121 = 11^2$. This
fact is easy to prove algebraically since we can write

$8 t_n + 1 = \frac{8n(n + 1)}{2} + 1 = 4n^2 + 4n + 1 = (2n + 1)^2$.

You will be asked to provide a visual — and, we hope, quite beautiful — 
geometric proof of this fact in Problem 2.4 using ideas similar to those 
in Figures 2.2 and 2.3. 
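The two facts just discussed (consecutive triangular numbers add to a square, and eight times a triangular number plus one is a square) are easy to test by machine. The following Python sketch is only an illustration; the helper name `t` is our own, and we take t(0) = 0 so that the first identity also makes sense for n = 1.

```python
def t(n):
    """nth triangular number, with t(0) = 0 by convention."""
    return n * (n + 1) // 2


for n in range(1, 1001):
    # Two consecutive triangular numbers add up to a square.
    assert t(n - 1) + t(n) == n ** 2
    # Eight times a triangular number, plus one, is an odd square.
    assert 8 * t(n) + 1 == (2 * n + 1) ** 2

print(t(4) + t(5))    # 10 + 15
print(8 * t(5) + 1)   # 8 * 15 + 1
```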


Tetrahedral and Pyramidal Numbers 

The geometrical properties of numbers studied in ancient times were 
not limited to two dimensions. Even in modern times, as you travel 
around the world you see fruit and produce stacked in geometric pat- 
terns in markets or by the roadside. For centuries cannon balls have 
been stacked in similar geometric patterns, and today, golf balls are 
often set out in exactly the same way on driving ranges. 

As we see in Figure 2.4, humans seem to have decided that there
are two natural ways to stack things: either they start with a triangular
base, or they start with a square base. Since we have been talking about
triangular numbers, we will focus for the moment on the triangular base
option. The oranges in Figure 2.4a form a tetrahedron (that is, a solid
figure with four equal equilateral triangular faces). Note how perfectly
the top orange nestles into the single space formed by the triangle of
three oranges in the second layer of the tetrahedron, and how these
three oranges similarly fit into the three spaces formed by the triangle
of six oranges in the third layer, and so on. Figure 2.5 illustrates the
idea that the number of oranges in each layer of such a tetrahedron is
represented by a triangular number.

Figure 2.4 (a) A tetrahedron of oranges. (b) A pyramid of golf balls. Figure 2.4(b)
is courtesy of Sgame/Shutterstock.

Figure 2.5 $T_3 = 1 + 3 + 6 = 10$.

Since we could make larger and larger stacks of oranges by placing
ever larger triangular layers at the bottom, we will define the nth
tetrahedral number $T_n$ to be

$T_n = t_1 + t_2 + \cdots + t_n$,

that is, the nth tetrahedral number is the sum of the first n triangular
numbers (the layers of the tetrahedron).

For example, $T_3 = t_1 + t_2 + t_3 = 1 + 3 + 6$, as illustrated in Figure 2.5.
And $T_4$ would be 1 + 3 + 6 + 10 = 20 (so there are 20 oranges in the
picture). The tetrahedral numbers form the sequence

1, 4, 10, 20, 35, 56, . . . .

Note that each number in this sequence increases by the next triangular
number, that is,

$T_n = T_{n-1} + t_n$.

A striking formula that was known in Egypt about 300 B.C., and was 
also found in India by Aryabhata around A.D. 500, involves the sum of 
the first n triangular numbers: 


$t_1 + t_2 + \cdots + t_n = \frac{n(n + 1)(n + 2)}{6}$.

In other words, this is a formula for the nth tetrahedral number. For
example, $T_6 = \frac{6(7)(8)}{6} = 56$. In Problem 2.6 you will be asked to prove
this formula using induction.
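To see the tetrahedral numbers emerge as running totals of the triangular numbers, one can use a short Python sketch such as the following. The function names are hypothetical, chosen only for this illustration; `itertools.accumulate` from the standard library produces the running sums.

```python
from itertools import accumulate


def triangular(n):
    """nth triangular number n(n + 1)/2."""
    return n * (n + 1) // 2


def tetrahedral_formula(n):
    """The closed formula n(n + 1)(n + 2)/6 for the nth tetrahedral number."""
    return n * (n + 1) * (n + 2) // 6


# Layers of the stack: the first six triangular numbers.
layers = [triangular(n) for n in range(1, 7)]

# Running totals of the layers are the tetrahedral numbers.
stacks = list(accumulate(layers))
print(stacks)

# The running totals agree with the closed formula.
assert stacks == [tetrahedral_formula(n) for n in range(1, 7)]
```

The last entry printed is the value of $T_6$ computed in the example above.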

There is also a very clever geometric proof of this formula, which we 
will describe using the tetrahedral number 20 as an example to illustrate 
the general idea. You might want to use actual objects such as oranges — 
balls of clay would be ideal — to help visualize what is going on in three 
dimensions. 

We begin with the tetrahedral number 20, and represent it as the
following triangle of numbers:

1
1 2
1 2 3
1 2 3 4

where the sum of each row is a triangular number; thus, 20 = 1 + 3 +
6 + 10.

Now if we take three copies of this triangle of numbers (each triangle
still representing the tetrahedral number $T_4 = 20$), and “add” these
three triangles,

1              1              4
1 2            2 1            3 3
1 2 3          3 2 1          2 2 2
1 2 3 4        4 3 2 1        1 1 1 1

then we get the following triangle of numbers

6
6 6
6 6 6
6 6 6 6

and we conclude that $3 \cdot T_4 = 6 \cdot t_4$, that is, $3 \cdot 20 = 6 \cdot 10$.

If we do this in general for the nth tetrahedral number we will
get

$3 T_n = (n + 2) t_n$.

Solving for $T_n$ gives us Aryabhata’s formula for the nth tetrahedral
number

$T_n = \frac{n + 2}{3} \, t_n = \frac{n + 2}{3} \cdot \frac{n(n + 1)}{2} = \frac{n(n + 1)(n + 2)}{6}$.
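The geometric argument can be imitated in a few lines of Python. This is only a sketch under our own conventions: each triangle of numbers is stored as a list of rows, and the names `number_triangles` and `add_triangles` are hypothetical. The three triangles are the three arrangements of the same tetrahedron, and adding them entry by entry should give n + 2 in every position.

```python
def number_triangles(n):
    """Return the three arrangements of the triangle of numbers for T_n."""
    first = [list(range(1, i + 1)) for i in range(1, n + 1)]    # 1 / 1 2 / 1 2 3 / ...
    second = [list(range(i, 0, -1)) for i in range(1, n + 1)]   # 1 / 2 1 / 3 2 1 / ...
    third = [[n - i + 1] * i for i in range(1, n + 1)]          # n / n-1 n-1 / ...
    return first, second, third


def add_triangles(n):
    """Add the three triangles position by position."""
    a, b, c = number_triangles(n)
    return [[x + y + z for x, y, z in zip(ra, rb, rc)]
            for ra, rb, rc in zip(a, b, c)]


def total(triangle):
    """Sum of all entries in a triangle of numbers."""
    return sum(sum(row) for row in triangle)


for row in add_triangles(4):
    print(row)

# Each arrangement still adds up to the tetrahedral number 20.
print([total(tri) for tri in number_triangles(4)])
```

For n = 4 every entry of the summed triangle is 6, and since the triangle has $t_4 = 10$ entries this is the relation $3 T_4 = 6 t_4$ from the text.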

Turning now to stacks formed by starting with a square base, as in the
stack of 30 golf balls in Figure 2.4b, we define the nth pyramidal number
$P_n$ to be

$P_n = 1^2 + 2^2 + \cdots + n^2$,

that is, the nth pyramidal number is the sum of the first n square
numbers (the layers of the pyramid).

For example, $P_4 = 1 + 4 + 9 + 16 = 30$, as illustrated by
the stack of golf balls. The pyramidal numbers form the sequence
given by

1, 5, 14, 30, 55, 91, . . . .

Note that each number in this sequence increases by the next square
number, that is,

$P_n = P_{n-1} + n^2$.

Since each layer of a pyramid is represented by a square $k^2$,
we can use the formula $k^2 = t_{k-1} + t_k$ to find a formula for $P_n$,
as follows:

$P_n = 1^2 + 2^2 + 3^2 + 4^2 + \cdots + n^2$
$= t_1 + (t_1 + t_2) + (t_2 + t_3) + (t_3 + t_4) + \cdots + (t_{n-1} + t_n)$
$= (t_1 + t_2 + t_3 + t_4 + \cdots + t_n) + (t_1 + t_2 + t_3 + \cdots + t_{n-1})$
$= T_n + T_{n-1} = \frac{n(n + 1)(n + 2)}{6} + \frac{(n - 1)n(n + 1)}{6}$
$= \frac{n(n + 1)(2n + 1)}{6}$.

You proved this same formula by induction in Problem 1.12. 
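The chain of equalities for the pyramidal numbers can be spot-checked numerically. The Python sketch below uses our own function names and takes the tetrahedral number for 0 to be 0, so that the case n = 1 is covered.

```python
def tetrahedral(n):
    """nth tetrahedral number n(n + 1)(n + 2)/6, with tetrahedral(0) = 0."""
    return n * (n + 1) * (n + 2) // 6


def pyramidal(n):
    """nth pyramidal number, computed directly as 1^2 + 2^2 + ... + n^2."""
    return sum(k * k for k in range(1, n + 1))


for n in range(1, 201):
    # A pyramid splits into two tetrahedra of consecutive sizes.
    assert pyramidal(n) == tetrahedral(n) + tetrahedral(n - 1)
    # The closed formula obtained at the end of the derivation.
    assert pyramidal(n) == n * (n + 1) * (2 * n + 1) // 6

print([pyramidal(n) for n in range(1, 7)])
```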

Let us now turn our attention from the specific content of the 
ancient Greek study of numbers to the manner in which they pursued 
mathematical truth. 


The Axiomatic Method 

During roughly the period from the sixth to the fourth century B.C., 
there developed within Greek mathematics the notion of proving 
things. This Greek notion of proof is the foundation upon which 
modern mathematics rests. The very way in which we go about doing 
mathematics today is something we inherited from the Greeks: the use 
of deductive reasoning to prove an ever-growing body of new facts from 
previously known facts, and the realization that you have to start with 
a set of simple “facts” called axioms that are taken as self-evidently true. 

When Thomas Jefferson wrote the Declaration of Independence of the 
Thirteen Colonies in June of 1776 he adopted this very same axiomatic 
method:

We hold these truths to be self-evident, that all men are created 
equal, that they are endowed by their Creator with certain 
unalienable Rights, that among these are Life, Liberty, and the 
pursuit of Happiness. 

Thus, in this way and in this document Jefferson establishes individual 
liberty as the bedrock upon which his argument calling for colonies to 
break with England stands. Jefferson's Declaration of Independence owes 
its form to the ancient mathematics of the Greeks. 

Euclid is justifiably one of the most famous Greek mathematicians
of all; only Archimedes and Apollonius can compare. Little is known
of Euclid's life. He is generally thought to have lived and
taught in Alexandria, the great center of learning on the north coast
of Egypt, around 300 B.C. However, Euclid wrote a text consisting of
thirteen books that contained definitions, axioms, and theorems and
brought together most of the mathematics that was known at that time.
Euclid's Elements would go on to be not only the most wildly successful
mathematics book ever written, but also a work rivaled only by the Bible in
terms of overall circulation and widespread influence. The Elements was
one of the first books to be printed after the invention of the printing
press. Since its first printing in Venice in 1482 more than a thousand
editions of the Elements have been published. The first English edition
appeared in 1570, and unfortunately its title page repeats a common error
by confusing Euclid with an earlier philosopher, Eucleides of Megara
(thus, in Latin, Euclid was often referred to as Euclidis Megarensis).

The very title of this great work tells us that it contains the 
“elements” from which all further mathematical truth can be pro- 
duced. Euclid begins with definitions and self-evident truths (axioms) 
and then uses deductive reasoning to produce an ever-growing body of 
new results (theorems and propositions) that are also true— all results 
flow logically from the initial elementary starting point. This is the form 
that Thomas Jefferson adopted in his Declaration of Independence. 

Most of the truths in the Elements deal with geometry (for example,
Proposition 47 in Book I is the “Pythagorean Theorem”), but three of
the books (VII, VIII, and IX) deal with number theory. For instance,
Proposition 20 in Book IX is the statement that there are infinitely many
prime numbers.

Proposition 20. Prime numbers are more than any assigned 
multitude of prime numbers. 

Euclid's proof of this fact is frequently cited as one of the most beautiful 
in all of mathematics. There is, however, no real evidence one way or the 
other that this proof can be attributed to Euclid (not that this detracts 
in the least from its beauty). The great early twentieth-century English 
mathematician G. H. Hardy called this proposition of Euclid’s and the 
proof that $\sqrt{2}$ is irrational “theorems of the highest class,” and wrote
that “each is as fresh and significant as when it was discovered— two 
thousand years have not written a wrinkle on either of them.” 

Here is Euclid’s proof. Euclid thought of numbers as representing line 
segments of various lengths. An explanatory remark has been added to 
his proof in brackets. 

Let A, B, C be the assigned prime numbers; I say there are more
prime numbers than A, B, C. For let the least number measured
by A, B, C be taken, and let it be DE; let the unit DF be added to
DE. [When Euclid says “measured” we would today say
“divisible,” and so the number DE is the product of the three
primes A, B, and C and then Euclid adds 1, that is, the length of
DF, to DE to get the number EF.]

Then EF is either prime or not. First, let it be prime; then the
prime numbers A, B, C, EF have been found which are more
than A, B, C.

Next, let EF not be prime; therefore it is measured by some
prime number. Let it be measured by the prime number G.

I say that G is not the same with any of the numbers A, B, C.
For, if possible, let it be so. Now A, B, C measure DE; therefore G
will also measure DE. But it also measures EF. Therefore G, being
a number, will measure the remainder, the unit DF, which is
absurd. Therefore, G is not the same with any one of the numbers
A, B, C. And by hypothesis it is prime. Therefore the prime
numbers A, B, C, G have been found that are more than the
assigned multitude of A, B, C. Q.E.D.

Figure 2.6 Title page of first English edition of Euclid's Elements, 1570.

Paul Erdős, the great twentieth-century mathematician whom we
mentioned in the first chapter, doubted the existence of God but be- 
lieved deeply in something he called the Book. For him the Book was 
something that did exist and was even almost spiritual, for it “contains 
the best proofs of all mathematical theorems, proofs that are elegant 
and perfect.” It is clear that Euclid’s proof of the infinitude of the primes 
comes straight from the Book. 

These days, we dress up Euclid’s proof in contemporary clothing, but 
we can’t pretend to improve it. 

Theorem 2.1. There are infinitely many prime numbers.

Proof

Suppose this is not the case, and that $p_1, p_2, \ldots, p_n$ are all the primes.
Then let

$N = p_1 p_2 p_3 \cdots p_n + 1$.

Now $N$ cannot be prime because it is larger than all of the primes
$p_1, p_2, \ldots, p_n$. So $N$ is divisible by a prime $p$. However, none of the
primes $p_1, p_2, \ldots, p_n$ can divide $N$ since it would then also divide 1
(this is because $1 = N - p_1 p_2 p_3 \cdots p_n$). Hence we have a contradiction
because $p$ is a prime other than $p_1, p_2, \ldots, p_n$, which from the start
were assumed to be all the primes. This completes the proof. ■

One change we made from Euclid in the proof of Theorem 2.1 is that 
instead of beginning with just three prime numbers A, B, C, we began 
in a style that is considered to be more “general” with an unknown 
number of primes $p_1, p_2, \ldots, p_n$. That’s just the way we write proofs
these days even though in this case the gain is minimal. It is clear in
Euclid’s proof that the set of primes A, B, C represents “any assigned
multitude of prime numbers” since their “assigned multitude” being 3 
has nothing at all to do with anything that happens in the proof. There 
is even a disadvantage to the modern general style in that we are forced 
to use subscripts to list the prime numbers, and this has the effect of 
cluttering things up somewhat. Nonetheless, it is important to learn 
this modern style of proof because it has the huge advantage of greatly 
clarifying situations that are far more complicated than this particular 
proof is. 

Proof by Contradiction 

Another change we made in Euclid’s proof is that we explicitly used 
“proof by contradiction.” Euclid’s proof has a different structure be- 
cause he is proving a different statement — namely, that given any 
finite set of prime numbers, there is another prime number not in 
that set. Hence the set of all prime numbers cannot be finite. Euclid, 
however, does use contradiction (also known as reductio ad absurdum)
within his proof to show that the prime G is not one of the numbers 
A, B, C. 
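Euclid's version of the argument is constructive: hand it any finite list of primes and it produces a prime that is not on the list. The Python sketch below illustrates this under our own naming (`smallest_prime_factor` and `new_prime` are hypothetical helpers, and trial division is simply the most elementary way to find a prime divisor, not anything Euclid prescribes). It assumes the input list contains distinct primes.

```python
from math import prod


def smallest_prime_factor(n):
    """Return the smallest prime factor of an integer n >= 2 by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime


def new_prime(primes):
    """Given a finite list of primes, return a prime that is not in the list."""
    n = prod(primes) + 1              # Euclid's number: the product plus the unit
    g = smallest_prime_factor(n)      # either n itself or a prime G measuring n
    assert g not in primes            # G would otherwise have to divide 1
    return g


print(new_prime([2, 3, 5]))               # here the product plus one is itself prime
print(new_prime([2, 3, 5, 7, 11, 13]))    # here it is composite, so we get a factor G
print(new_prime([3, 5, 7]))               # the list need not be the first few primes
```

The second example shows why both cases in Euclid's proof are needed: 2 · 3 · 5 · 7 · 11 · 13 + 1 = 30031 is not prime, but its prime factors are still new.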

It is worth pausing to see how proofs by contradiction work. The 
method of contradiction — which G. H. Hardy called “one of a mathe- 
matician’s finest weapons” — can be traced back to the Eleatics, that is, 
to Parmenides of Elea and his followers. The Eleatics, contemporaries 
of the Pythagoreans in southern Italy in the fifth century B.C., empha- 
sized pure reason and logic in their philosophy. Consider the way in 
which we start our proof of Theorem 2.1. What we say in effect is: either 
the number of primes is finite or the number of primes is not finite. 
Similarly, in his proof Euclid says in effect: either G is one of the numbers 
A, B, C, or not. The way contradiction works is to begin with a statement 
of pure reason, such as these, that clearly lays out two alternatives (only 
one of which is going to be true, the other therefore being false); next, 
you simply eliminate one of the two alternatives. What you are left 
with then must be true. What we did in our proof was eliminate the 
alternative that the number of primes is finite. What Euclid did in his 
proof was eliminate the alternative that G is one of the numbers A, B, C.
The method is called contradiction, or reductio ad absurdum, because the 
way you eliminate an alternative is to assume that alternative, and then 
reach a contradiction — that is, a point of absurdity. 

Sherlock Holmes was a firm advocate of the power of deductive 
reasoning — “that true cold reason which I place above all things” — 
and he frequently lectured his companion Watson on the virtues of 
an indirect proof. Here he is in 1890 speaking to Watson in the second
Sherlock Holmes novel, The Sign of the Four:

How often have I said to you that when you have eliminated the 
impossible, whatever remains, however improbable, must be the 
truth. 

You will be asked to heed this good advice of Sherlock Holmes in 
Problem 2.17 in order to prove that $\sqrt{2}$ is irrational (whether or not that
sounds improbable to you). 

A good illustration of the use of contradiction occurred in our proof 
of Theorem 1.1 in Chapter 1. Look back at that proof. In order to 
prove that one of the numbers x or y in a primitive Pythagorean triple
{x, y, z} must be even and the other odd, we eliminated the other two
possibilities, namely, the possibility that both x and y might be even,
and the possibility that both might be odd. Sir Arthur Conan Doyle
and G. H. Hardy have it exactly right: contradiction is indeed “one of 
a mathematician’s finest weapons.” 

Euclid, too, was a great fan of contradiction. He used it often in the 
Elements, and even used it as early as Proposition 6 in Book I to prove 
that if two angles in a triangle are equal, then the triangle is isosceles. 
He begins his proof by declaring that he will show that the two sides 
AB and AC subtending the two equal angles are themselves equal, and 
then goes straight into the contradiction mode by saying “for, if AB is 
unequal to AC, one of them is greater; let AB be greater.” 


Euclid's Self-Evident Truths 

Euclid begins his study of numbers in Book VII with definitions: a 
“unit” is “that by virtue of which each of the things that exist is called 
one,” a “number” then is “a multitude composed of units.” There are 
the expected definitions: an “even number” is “that which is divisible
into two equal parts,” an “odd number” is “that which differs by a
unit from an even number,” and a “prime number” is “that which
is measured by a unit alone”; but there are also a few unexpected
definitions: an “even-times even number” is “that which is measured
by an even number according to an even number,” an “even-times odd
number” is “that which is measured by an even number according to
an odd number,” and an “odd-times odd number” is “that which is
measured by an odd number according to an odd number.”

Then, as he had done in his books on geometry, and with these 
definitions as his starting point, Euclid goes on to arrange all of his 
propositions on number theory in Books VII-IX in a logical order so 
that each proposition can in turn be deduced from previous results. For
example, in Book IX in order to prove Proposition 22, which says that
the sum of an even number of odd numbers will always be even, Euclid
uses Proposition 21, which guarantees that a sum of even numbers is
always even. This is what we mean by the term “the axiomatic method,”
and Euclid’s application of it in the Elements is the first, and still best,
example of this method.

Figure 2.7 Euclid, shown in a detail of Raphael's The School of Athens, 1509/10.

In the next chapter we will look at one of the most fundamental 
notions in number theory — divisibility — from precisely this axiomatic 
Euclidean point of view. You may find it somewhat surprising there 
that suddenly we shall be taking such painstaking care over what seem 
to be rather obvious ideas about the simple and familiar notion of 
division. But to forewarn you about some of the subtleties involved 
in such matters we will discuss a theorem that Euclid did not include
in the Elements, even though it surely was known at the time. This
is the important theorem, which you were asked to prove in Problem 1.13,
that every integer greater than 1 has a prime decomposition. For
example, $60 = 2 \cdot 2 \cdot 3 \cdot 5$, and $210 = 2 \cdot 3 \cdot 5 \cdot 7$; and it should be clear
to you that you could factor the number 123 456 789 into a product of
primes if for some bizarre reason you decided to do so.

Theorem 2.2. Every integer greater than 1 is either prime or can be written
as a product of primes. 

Proof 

This theorem might seem all but self-evident to you. In fact, before we 
give a rigorous proof of the theorem, let's try to see why the theorem 
is true. If n is an integer greater than 1, but not itself prime, then it is 
composite, and we can write n = rs where both r and s are smaller
than n. The same argument can now be repeated on each of the two 
factors r and s and, in turn, repeated again as many times as needed on 
any subsequent factors that arise. Since at each stage the factors always 
get smaller, this process eventually has to stop, and, when it does, each 
factor is a prime number, and n has in this way been written as a product 
of primes. You should recognize the basic idea here, because this has 
really been just an infinite descent argument. 

This argument, while in some ways getting to the heart of the matter, 
also leaves much to be desired since it boils down to saying that you 
can’t just keep factoring forever, which sounds way too simple. We can 
make this argument more rigorous in a couple of ways. Here is one way. 
Turn it into a proof by contradiction: suppose that there is a composite 
number that cannot be written as a product of primes. Then, there is a 
smallest such composite number, call this number n. But n is composite, 
so we can write n = rs where both r and s are smaller than n but still 
greater than 1. Therefore, by the choice of n, both r and s can be written 
as a product of primes, hence so can n, which is a contradiction. 

Another way to make this proof rigorous is to use a slightly different 
form of induction, sometimes called strong induction. The variation 
from the standard method of induction is a minor one: you prove the 
statement for a small value such as n = 1, and then you prove that 
whenever the statement is true for all values up to and including n, it 
is also true for the next value n + 1. 

In this case, we first observe that the theorem is true for the smallest
value, namely, n = 2, since 2 is prime. Now, assume that the theorem is
true for all values up to and including some integer n where n ≥ 2. We
must prove that the theorem is true for the integer n + 1. If n + 1 is prime,
then we are done. If, on the other hand, n + 1 is composite, then we can
write n + 1 = rs where 1 < r, s < n + 1. But then, 1 < r, s ≤ n and so,
by our assumption, r and s can each be written as a product of primes,
hence so can n + 1. This completes the proof. ■
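The informal "keep factoring until you can't" argument translates directly into a recursive procedure. The Python sketch below is one possible rendering, with a hypothetical function name `factor`; it splits n as rs whenever it finds a divisor r and recurses on both factors, which terminates precisely because the factors get smaller at each stage.

```python
from math import isqrt, prod


def factor(n):
    """Return a list of primes whose product is n, for an integer n > 1."""
    for r in range(2, isqrt(n) + 1):
        if n % r == 0:
            # n = r * s with 1 < r, s < n: repeat the argument on each factor.
            return factor(r) + factor(n // r)
    # No divisor between 2 and sqrt(n), so n is prime and we stop.
    return [n]


print(factor(60))
print(factor(210))

# Even the "bizarre" example from the text factors without difficulty.
parts = factor(123456789)
print(parts)
assert prod(parts) == 123456789
```

Note that the procedure only shows that some prime factorization exists; it says nothing yet about whether a different sequence of choices could lead to a different list of primes.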

When reading the proof of this theorem, you might have paused 
momentarily at the point where we say that if there is a composite 
number that cannot be written as a product of primes, then there 
must be a smallest such composite number. But the idea being used
there is a fundamental one. All we are really saying is that any nonempty set of
positive integers (in this case, the set of composite numbers that
cannot be written as a product of primes) has to have a smallest element.
This fact is self-evident, and even has a name: the well-ordering
principle.

The well-ordering principle is the statement that every nonempty set of 
natural numbers contains a least element. This fundamental property 
of the natural numbers is frequently taken as an axiom, and then used 
to establish other properties of natural numbers and integers. 


Unique Factorization 

What is not at all obvious (and in fact requires careful proof, which we
will be able to do in the next chapter) is that for a given number such
as 107 207 100 there is only one way to write this number as a product
of primes, namely, $107\,207\,100 = 2^2 \cdot 3^2 \cdot 5^2 \cdot 7^2 \cdot 11 \cdot 13 \cdot 17$.

You may think this is obvious, but your only argument in support 
of this position at this point would be something along the lines of 
“well, how else could you possibly factor 107 207 100?” This is not 
an argument that would have impressed Euclid. In fact, in order to 
convince you how weak this argument is, we will now look at a number 
system where this argument fails completely. So, temporarily, we are 
going to step outside of the natural numbers and consider complex 
numbers of the form $a + b\sqrt{-6}$ where a and b are integers. 

The first thing to say about this larger number system is that it is 
closed under addition and multiplication, which just means that if we 
add two numbers in the system, their sum will again be a number in the 
system 


$$(2 + 3\sqrt{-6}) + (1 - 2\sqrt{-6}) = 3 + \sqrt{-6},$$

and if we multiply two numbers in the system, their product will again 
be a number in the system 


$$(2 + 3\sqrt{-6}) \cdot (1 - 2\sqrt{-6}) = 2 - 4\sqrt{-6} + 3\sqrt{-6} - 6(-6) = 38 - \sqrt{-6}.$$


The next thing we need to do in this number system is have a way 
to decide whether a number $a + b\sqrt{-6}$ is prime. For example, we see 
above that the number $38 - \sqrt{-6}$ is composite because it factors into 
a product of two smaller numbers. But what about $10 + 3\sqrt{-6}$? Is it 
prime or composite? Well, we could try to factor it; after all that’s what 
we do in the integers, but that is much more cumbersome here, and 
besides, there are many more potential factors that we would have to 
try. A much better idea is to reduce this question to a different, and 
easier, question. Here is how we do this. 

We will define a function on this number system specifically for 
this purpose, called the norm function. This norm function will map 
numbers from this number system to the integers, and will have the 
useful property that it reduces questions of whether numbers such as 
$10 + 3\sqrt{-6}$ are prime to vastly simpler questions of whether their images 
are prime in the integers. 

So, for a number $a + b\sqrt{-6}$ in our number system, we define the norm 
of the number to be 

$$N(a + b\sqrt{-6}) = a^2 + 6b^2.$$

Note that, as advertised, the norm of a number is an integer. 

Now, the key property of the norm function is that it respects multiplication 
between these two number systems, that is, if $\alpha = a + b\sqrt{-6}$ 
and $\beta = c + d\sqrt{-6}$ are two numbers in our number system, 
then 

$$N(\alpha\beta) = N(\alpha)N(\beta).$$

You will be asked to prove this fundamental property of the norm 
function in Problem 2.19. 
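
A quick numerical experiment can make this property concrete, although it is of course no substitute for the proof requested in Problem 2.19. In the sketch below, a number a + b√−6 is stored as the integer pair (a, b); the helper names mul and norm are assumptions of this illustration.

```python
def mul(x, y):
    """Multiply a + b*sqrt(-6) by c + d*sqrt(-6), each given as an integer pair."""
    a, b = x
    c, d = y
    # (a + b*w)(c + d*w) = ac + (ad + bc)*w + bd*w*w, where w*w = -6.
    return (a * c - 6 * b * d, a * d + b * c)


def norm(x):
    """Norm of a + b*sqrt(-6): a^2 + 6*b^2."""
    a, b = x
    return a * a + 6 * b * b


alpha, beta = (2, 3), (1, -2)
product = mul(alpha, beta)
print(product)                                   # (38, -1)
print(norm(product), norm(alpha) * norm(beta))   # 1450 1450
```

The product agrees with the multiplication worked out earlier in the text, and the two printed norms coincide for this particular pair.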

Let’s see how the norm can be used to show that $2 + \sqrt{-6}$ is prime. 
Suppose that $2 + \sqrt{-6} = \alpha\beta$, where $\alpha = a + b\sqrt{-6}$ and $\beta = c + d\sqrt{-6}$. 

Then, since $N(2 + \sqrt{-6}) = 2^2 + 6 \cdot 1^2 = 10$, when we apply the norm to 
both sides of $2 + \sqrt{-6} = \alpha\beta$, we get 

$$10 = N(\alpha\beta) = N(\alpha)N(\beta),$$

where $N(\alpha) = a^2 + 6b^2$ and $N(\beta) = c^2 + 6d^2$. But now recall that this is 
all taking place inside the integers, and we know that the only positive factors of 
10 are 1, 2, 5, and 10. At this point it is easy to see that neither $a^2 + 6b^2$ 
nor $c^2 + 6d^2$ can ever equal either 2 or 5, so the only option is that one 
of them equals 1, and the other equals 10. Thus we conclude that either 
$a = \pm 1$ and $b = 0$, or $c = \pm 1$ and $d = 0$; that is, either $\alpha = \pm 1$ or $\beta = \pm 1$. In other words, 
$2 + \sqrt{-6}$ is prime. 
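
The claim that a² + 6b² can never equal 2 or 5 is easy to confirm by a finite search, since |b| cannot exceed the square root of m/6. The following sketch is an illustration only; the function name representations is an assumption of this example.

```python
from math import isqrt


def representations(m):
    """All integer pairs (a, b) with a*a + 6*b*b == m, for a positive integer m."""
    found = []
    bound = isqrt(m // 6)          # |b| can be no larger than this
    for b in range(-bound, bound + 1):
        rest = m - 6 * b * b       # what a*a would have to equal
        a = isqrt(rest)
        if a * a == rest:
            found.append((a, b))
            if a != 0:
                found.append((-a, b))
    return found


for m in (1, 2, 5, 10):
    print(m, representations(m))
```

For m = 2 and m = 5 the list comes back empty, for m = 1 it contains only the pairs for ±1, and for m = 10 it contains the four pairs corresponding to ±2 ± √−6.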

Not surprisingly we could use this same procedure with the norm to 
show that the number $2 - \sqrt{-6}$ is also prime in this number system. 
We can even use this procedure to show that the numbers 2 and 5 are 
prime in this system. (Note that numbers such as 2 and 5, which are 
prime as integers, still need to be shown to be prime in this new system 
because there are simply more numbers present as potential divisors in 
this number system. Perhaps we should also point out the important 
detail that the numbers 2 and 5 are actually in this new number system 
since we can write $2 = 2 + 0 \cdot \sqrt{-6}$ and $5 = 5 + 0 \cdot \sqrt{-6}$.) 

Now, we are finally prepared to produce two different prime factorizations 
for a single number: 

$$10 = 2 \cdot 5 \quad\text{and}\quad 10 = (2 + \sqrt{-6}) \cdot (2 - \sqrt{-6}).$$

Note that the second factorization is a direct response to a question you 
never would have asked yourself because the answer would have seemed 
completely obvious (although wrong, as it turns out): “how else could 
you possibly factor 10?” 

The moral of the story is that some number systems have unique 
factorization, and some don’t. For those number systems that do, such 
as the integers, therefore, we have to prove that they possess this very 
desirable property, and that takes some very careful preparation, which 
we will begin in the next chapter. 

But let’s get back to $10 + 3\sqrt{-6}$ and decide whether it is prime. 
Suppose that $10 + 3\sqrt{-6} = \alpha\beta$, where $\alpha = a + b\sqrt{-6}$ and $\beta = c + d\sqrt{-6}$. 
Since $N(10 + 3\sqrt{-6}) = 10^2 + 6 \cdot 3^2 = 154$, and $N(\alpha) = a^2 + 6b^2$ and 
$N(\beta) = c^2 + 6d^2$, we need to consider the factorization of $154 = 2 \cdot 7 \cdot 11$. 
First, it is clear that neither $a^2 + 6b^2$ nor $c^2 + 6d^2$ can equal 2 or 11, so we 
might as well set $N(\alpha) = a^2 + 6b^2 = 7$ and $N(\beta) = c^2 + 6d^2 = 22$. 

So we easily get $a = \pm 1$, $b = \pm 1$ as the only solutions for the first 
equation. Note that there are four solutions here, hence four possible 
values for $\alpha$. The second equation is a little trickier, but a moment’s 
reflection yields the only solutions: $c = \pm 4$, $d = \pm 1$, and again this 
means four possible values for $\beta$. And we quickly discover that in fact, 
by choosing $a = 1$, $b = 1$ and $c = 4$, $d = -1$, we get 

$$(1 + \sqrt{-6}) \cdot (4 - \sqrt{-6}) = 10 + 3\sqrt{-6},$$

and so $10 + 3\sqrt{-6}$ is not prime. 
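
The search just carried out by hand can be automated. The sketch below is one possible approach under stated assumptions, not a method taken from the text: numbers a + b√−6 are stored as integer pairs (a, b), and the helper names mul, representations, and factorizations are invented for this illustration. For each divisor of the norm, it lists the candidates of that norm and keeps the pairs whose product is the target.

```python
from math import isqrt


def mul(x, y):
    """Multiply a + b*sqrt(-6) by c + d*sqrt(-6), each given as an integer pair."""
    a, b = x
    c, d = y
    return (a * c - 6 * b * d, a * d + b * c)


def representations(m):
    """All integer pairs (a, b) with a*a + 6*b*b == m."""
    found = []
    bound = isqrt(m // 6)
    for b in range(-bound, bound + 1):
        rest = m - 6 * b * b
        a = isqrt(rest)
        if a * a == rest:
            found.append((a, b))
            if a != 0:
                found.append((-a, b))
    return found


def factorizations(target):
    """Pairs (alpha, beta), neither equal to 1 or -1, with alpha * beta == target."""
    a, b = target
    n = a * a + 6 * b * b
    result = []
    for d in range(2, n):              # norms strictly between 1 and n
        if n % d == 0:
            for alpha in representations(d):
                for beta in representations(n // d):
                    if mul(alpha, beta) == target:
                        result.append((alpha, beta))
    return result


print(factorizations((10, 3)))   # includes ((1, 1), (4, -1))
print(factorizations((2, 1)))    # [] since 2 + sqrt(-6) is prime
```

Because only 1 and −1 have norm 1, every nontrivial factorization must show up in this search, so an empty list means the number is prime in this system.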

Note that something quite interesting has emerged here. The question 
of whether $10 + 3\sqrt{-6}$ is prime in this number system seems 
to be connected to the kinds of questions involving sums of squares 
we dealt with in the first chapter, since it boiled down to deciding 
whether there were integer solutions to the equations $a^2 + 6b^2 = 7$ 
and $c^2 + 6d^2 = 22$. This turns out to be a familiar pattern in number 
theory, and in mathematics more generally, in that ideas that may at 
first appear unrelated often end up being connected in profound ways. 


Pythagorean Tuning 

A method of tuning musical instruments that was widely used during 
the medieval period in Europe is known as Pythagorean tuning. This 
tuning method is sometimes still used today for special purposes. It 
gets its name of course from Pythagoras because, as mentioned earlier, 
he is supposed to have been the one who discovered the relationship 
between the length of a musical string and the pitch, or frequency, 
of the sound it produces. As we will see, Pythagorean tuning is very 
much connected to one of the ideas we have just been discussing: prime 
decomposition in the integers. 

Before we look at how to build a twelve-note scale using Pythagorean 
tuning, let’s go over a few basic ideas about music and sound. Perhaps 
the most fundamental concept of all is that of an octave. An octave 
is an interval between two notes where the frequency of one note is 
exactly twice that of the frequency of the other note. This is why two 
notes played an octave apart sound so similar to us. Thus, an octave 
corresponds to a ratio 2:1 (that is, to the rational number $\frac{2}{1}$). Notes 
played two octaves apart would have frequencies whose ratio is 4:1, 
three octaves apart would be 8:1, and so on. 

The next simplest ratio is the ratio 3:2 and, not surprisingly, two 
notes that are played simultaneously whose frequencies have this ratio 
sound especially pleasing to us. This ratio, or interval, is the basis of 
the Pythagorean tuning system. It is astonishing that an entire musical 
system can be constructed from a single rational number $\frac{3}{2}$. Here is how 
it is done. 

The idea is to build a twelve-note scale. That is, we wish to construct a 
sequence of twelve notes such that each note is slightly higher in pitch 
than the previous note, and if just one more note were added to the very 
end, also slightly higher, this thirteenth note would be an octave higher 
than the first note in the sequence. This, of course, is the way a piano 
works. If you start at middle C and count twelve notes, the twelfth note 
is B, and the next note puts you at C again, an octave higher than where 
you started (this same point is made about the seven white keys on a 
piano in a song you undoubtedly know from The Sound of Music: “Do, a 
deer, a female deer . . . That will bring us back to Do”). The question is: 
how do you tune all the notes in between? 

This is how you tune the notes using Pythagorean tuning, that is, 
using the basic interval whose ratio is $\frac{3}{2}$. By the way, in music theory, 
this interval is called a perfect fifth. You’ll see why shortly, but it is a 
useful term for us. Begin with any note. For the moment we will simply 
call this note $n_1$ since we don't care what its pitch is, but it is the first and 
also the lowest note in our scale. It automatically has a ratio with itself 
of $\frac{1}{1}$. 

Next we tune a note that is exactly a perfect fifth higher than $n_1$ and, 
since it will be the second note to be tuned in our scale, we will call this 
note $n_2$. Note that $n_2$ is not just slightly higher than $n_1$, but it is quite 
a bit higher since it will now have a pitch ratio of $\frac{3}{2}$ as compared to $n_1$. 
Again, $n_2$ is the second note we tuned, not the second highest note in 
the scale (it will in fact turn out to be the eighth highest). 

This might be a good time to pause and briefly talk about what 
we mean by tuning. If you are tuning a guitar, say, it is the relative 
tuning of the strings that matters; so you begin with one string that 
you hope is roughly in tune, and tune the rest of the strings relative 
to that one string. At the beginning of an orchestra concert, an oboist 
plays a single note, an A, and the players in the orchestra all tune 
their own instruments to that one note. These days, we can measure 
frequency very accurately. That A on the oboe is supposed to be exactly 
440 Hz (hertz is the unit of frequency, that is, cycles per second). So, for 
example, if we tuned our first note $n_1$ to 440 Hz, then our second note 
$n_2$ would now be tuned to 660 Hz. 

Now, let’s tune another note. You should have a good guess as to what 
we are going to do. For a third note to tune we want to go exactly a 
perfect fifth higher than $n_2$. However, there is a small problem. If we 
do that, this note would have a pitch ratio of $(\frac{3}{2})^2 = \frac{9}{4}$ as compared to $n_1$, 
but since $\frac{9}{4} > 2$ this note will be more than an octave higher than $n_1$. 
In other words, this note is too high. But, that is easy to fix since notes 
that are an octave apart sound almost the same to us: we just bring this 
back down an octave! We do that by dividing its frequency by 2. Thus 
we tune our third note by giving it a pitch ratio of $\frac{9}{8}$ instead of $\frac{9}{4}$, and we 
call this note $n_3$. 

We can tune a fourth note by going exactly a perfect fifth higher than 
$n_3$, and so the fourth note we tune, $n_4$, will have a pitch ratio of $\frac{9}{8} \cdot \frac{3}{2} = \frac{27}{16}$. 
Note that $\frac{27}{16} < 2$, so $n_4$ is within an octave of $n_1$. 

The fifth note tuned again will be an octave too high, but we continue 
in this way always multiplying by $\frac{3}{2}$, and, when necessary, also dividing 
by 2 in order to bring notes back into the range of a single octave above 
$n_1$. Now that we see how the tuning works, let’s be a bit more specific. 

If the first note $n_1$ is F, one could observe that the first seven notes 
produced in the Pythagorean tuning, in order, are F, C, G, D, A, E, B and 
that on a keyboard these are the seven white keys in an octave: F, G, A, 
B, C, D, E. Of course, another F fits nicely above these seven notes an 
octave above the original F. This, of course, is why we call this an octave 
in the first place. You should also now see why the interval from F to C 
is called a perfect fifth. 

If we now continue with our Pythagorean tuning, we next get the 
five black keys in the order F#, C#, G#, D#, A#. Now we have reached 
the point where prime decomposition enters the picture. If we try to 
tune a thirteenth note a perfect fifth above this A#, we get a note that 
is extremely close to, but not exactly at, an F that is an octave above the 
original F. 
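
The tuning procedure described above is mechanical enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the function name pythagorean_ratios is invented here, and exact fractions are used so that no rounding hides the small gap at the thirteenth note.

```python
from fractions import Fraction


def pythagorean_ratios(count):
    """Pitch ratios of the first `count` notes, listed in the order they are tuned."""
    ratios = [Fraction(1)]
    for _ in range(count - 1):
        ratio = ratios[-1] * Fraction(3, 2)   # go up a perfect fifth
        if ratio > 2:                          # more than an octave above the first note
            ratio /= 2                         # bring it back down an octave
        ratios.append(ratio)
    return ratios


ratios = pythagorean_ratios(12)
print([str(r) for r in ratios[:4]])            # ['1', '3/2', '9/8', '27/16']

# A perfect fifth above the twelfth note, without bringing it down an octave:
thirteenth = ratios[-1] * Fraction(3, 2)
print(thirteenth, float(thirteenth))           # 531441/262144, about 2.027
```

Sorting the twelve ratios by size gives the notes in scale order; in that ordering the second note tuned, with ratio 3/2, sits in eighth place.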

In fact, Pythagorean tuning could never produce an F that is exactly 
an octave above the original F, no matter how many notes we produce. 
Let $n_k$ be the ratio of the frequency of the kth note produced by 
Pythagorean tuning to the frequency of the first note; then we will have 
multiplied the frequency of the first note by $(\frac{3}{2})^{k-1}$, but also divided by 
2 each time we had to bring a note back into the octave range above the 
first note; let r be the number of times we divided by 2. Thus 

$$n_k = \frac{(3/2)^{k-1}}{2^r} = \frac{3^{k-1}}{2^{k-1+r}},$$

and we see that it is impossible for $n_k = 2$. Why is this impossible? 
Because if $n_k = 2$ for some k, then we get $2^{k+r} = 3^{k-1}$, and we would 
have an integer with two different prime decompositions! (You may see 
other contradictions as well.) 

Pythagorean tuning nonetheless comes pretty close to an octave on 
the thirteenth note. In order to produce the first twelve notes in the 
scale, it turns out you have to bring notes back down an octave six times, 
that is, r = 6. Therefore, 


$$n_{13} = \frac{3^{12}}{2^{18}} = \frac{531\,441}{262\,144} \approx 2.027.$$


Whether you want to call this close to an octave depends on your 
sense of pitch. It amounts roughly to being able to notice the difference 
between an A at 440 Hz, and an A at 446 Hz; do you think you could 
detect which of these two tones was the higher one? Musicians have 
a poetic term for the ratio of 2.027 to 2, that is, for the number 1.014; 
they call it the Pythagorean comma since it measures the degree by which 
Pythagorean tuning barely misses perfection, perfection being 1. 
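
The arithmetic behind these figures can be checked directly. The short sketch below uses exact fractions; the variable names are assumptions of this illustration.

```python
from fractions import Fraction

thirteenth = Fraction(3**12, 2**18)     # pitch ratio of the thirteenth note
comma = thirteenth / 2                  # its ratio to a true octave

print(thirteenth)                       # 531441/262144
print(round(float(comma), 4))           # 1.0136
print(round(440 * float(comma), 1))     # 446.0
```

The last line is the source of the comparison between an A at 440 Hz and one at about 446 Hz.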


Problems 

2.1 ★ (S) Use induction to verify the formula $t_n = \frac{n(n+1)}{2}$ for triangular 
numbers. 

2.2 ★ Prove the formula 

$$t_{n-1} + t_n = n^2$$

by using the basic formula $t_n = \frac{n(n+1)}{2}$ for triangular numbers. 

2.3 (H) Prove that the difference between the squares of any two consecutive 
triangular numbers is always a cube. For example, 

$$21^2 - 15^2 = 441 - 225 = 216 = 6^3.$$

2.4 (H) Give a visual, geometric proof of the fact mentioned in the text 
(using ideas similar to those in Figures 2.2 and 2.3) that $8 \cdot t_n + 1$ is a 
square by showing in a diagram how eight triangular arrays of stones 
together with one extra stone can be arranged into a square array of 
stones. 

2.5 Note that for the tetrahedral numbers $T_4 = 20$ and $T_5 = 35$ we have 

$$20 = 1 \cdot 4 + 2 \cdot 3 + 3 \cdot 2 + 4 \cdot 1 \quad\text{and}\quad 35 = 1 \cdot 5 + 2 \cdot 4 + 3 \cdot 3 + 4 \cdot 2 + 5 \cdot 1.$$

This is true in general; in other words, it is always the case that 

$$T_n = 1 \cdot n + 2 \cdot (n - 1) + 3 \cdot (n - 2) + \cdots + (n - 1) \cdot 2 + n \cdot 1.$$

Give a geometric explanation for why this is true. It might be useful to 
visualize a tetrahedron of oranges. 

2.6 ★ (H) The formula 

$$t_1 + t_2 + \cdots + t_n = \frac{n(n+1)(n+2)}{6}$$

for the sum of the first n triangular numbers (that is, the formula for 
the nth tetrahedral number $T_n$) was proved in the text using a visual, 
geometric argument. Prove this formula using induction. 

2.7 (H,S) Nicomachus noticed, but did not prove, a very nice property of 
the odd numbers: 

     1                           1
     3   5                       8
     7   9  11                  27
    13  15  17  19              64
    21  23  25  27  29         125

where if the odd numbers are arranged in a triangular array as shown 
(with the sum of each row written at the right), 
then the sum of the numbers in the nth row of the triangle is always $n^3$. 

An equivalent observation, since there are n numbers in the nth row 
of the triangle, is that the average of the nth row is always $n^2$. For 
example, the average for the fifth row is 25, and the average for the 
fourth row is 16. 

Prove this property of the odd numbers. 

2.8 (H,S) Consider the following triangular arrangement of the positive 
integers: 


     1
     2   3
     4   5   6
     7   8   9  10
    11  12  13  14  15
    16  17  18  19  20  21

If we add the numbers in the first two odd rows, we get 

$$1 + 4 + 5 + 6 = 16 = 2^4.$$

If we add the numbers in the first three odd rows, we get 

$$1 + 4 + 5 + 6 + 11 + 12 + 13 + 14 + 15 = 81 = 3^4.$$

Use induction to prove that, in general, the sum of the numbers in 
the first k odd rows equals $k^4$. 

2.9 (H,S) (a) Show that 2701 is a triangular number. 

(b) Devise a test that determines whether a number N is triangular. 

(c) Pick a number at random that is greater than 5000, and 
use your test from part (b) to decide whether this number is 
triangular. 

2.10 (S) Prove that, for each positive integer n, the number 

$$\frac{n(n+1)(n+2)(n+3)}{8}$$

is a triangular number. 

2.11 (S) If we look at the sequence of triangular numbers 

1, 3, 6, 10, 15, 21, 28, 36 

we can’t help but notice that the eighth triangular number, $t_8 = 36$, is a 
square, and, of course, so is the first triangular number, $t_1 = 1$. One 
wonders if there are other triangular numbers that are also squares. 

Prove that there are infinitely many square triangular numbers by 
proving that if $t_n$ is square, then $t_{4n(n+1)}$ is also square. For example, for 
n = 1, since $t_1$ is a square, then so is $t_{4 \cdot 1(1+1)} = t_8$. Similarly, since $t_8$ is a 
square, then so is $t_{4 \cdot 8(8+1)} = t_{288}$. We can confirm this by computing 
$t_{288} = \frac{288 \cdot 289}{2} = 144 \cdot 289 = (12 \cdot 17)^2 = 204^2$. 

Show, however, that the infinite sequence of square triangular 
numbers produced in this way, $t_1, t_8, t_{288}, \ldots$, does not produce all 
square triangular numbers by finding a square triangular number not 
on this list. 

This problem appeared in The American Mathematical Monthly 69(2) 
(February 1962), 168. 

2.12 ★ (S) In addition to square and triangular numbers Nicomachus also 
discussed other numbers with geometric qualities. The pentagonal 
numbers are the numbers 

1, 5, 12, 22, 35, ... 

and the hexagonal numbers are the numbers 

1, 6, 15, 28, 45, ... 

that represent ever larger patterns of pentagons and hexagons, 
respectively, analogous to the pattern of triangles shown in 
Figure 2.1. 

(a) Draw the patterns of pentagons and hexagons that produce the 
first five pentagonal numbers, and the first five hexagonal 
numbers. 

(b) Note that we can write a pentagonal number such as 35 using a 
sum of the differences between successive pentagonal numbers: 

35 = 1 + 4 + 7 + 10 + 13. 

We can do the same thing for hexagonal numbers: 

45 = 1 + 5 + 9 + 13 + 17. 


By the way, it should be easy for you to see what these differences 
represent in the patterns you drew in part (a). 

Write a formula for the nth pentagonal number $p_n$ using a 
sum of the differences between the first n pentagonal numbers. 
Your formula should look a lot like the formula we know for the 
nth square number: $1 + 3 + 5 + \cdots + (2n - 1)$. 

(c) Write a formula for the nth hexagonal number $h_n$ using a sum of 
the differences between the first n hexagonal numbers. 

(d) Find closed formulas for $p_n$ and $h_n$; that is, find formulas similar 
to the formula $t_n = \frac{n(n+1)}{2}$. Then use induction to verify these 
formulas. For the induction step you will need your formulas 
from parts (b) and (c). 

Nicomachus extended this idea and also considered septagonal and 
octagonal numbers for which there are similar patterns and formulas. 

2.13 (H,S) A number is called trapezoidal if it is the sum of two or more 
consecutive positive integers. The name arises because a trapezoidal 
number of stones can be arranged into a trapezoidal array; a 
trapezoidal number can also be expressed as the difference of two 
triangular numbers (see “Trapezoidal Numbers” by J. Watkins, C. 
Gamer, and D. Roeder, Mathematics Magazine 58(2) (March 1985), 
108-10). For example, 18 is trapezoidal because 18 = 3 + 4 + 5 + 6, and 
we can write $18 = t_6 - t_2 = 21 - 3$. For convenience, we will think of 
$t_0 = 0$ also being a triangular number. 

Prove that all positive integers are trapezoidal except powers of 2. 

2.14 * (S) Several times in the text we have presented visual proofs for various 
formulas. For example, in Chapter 1 we used Figure 1.3 to make the 
truth of the formula 1 + 3 + 5 + • • • + (2n - 1) = n 2 immediately clear. 
Similarly, Figures 2.2 and 2.3 were used to prove two formulas 
involving triangular numbers. This “look and see” method of proof is 
very powerful, but is quite different from the axiomatic method of 
proof in that it relies on things becoming self-evident once we look at 
them in just the right way. These are called diknume proofs, and were 
frequently used in Greek mathematics ( diknume is a Greek word 
meaning “to make visible or evident”). 

Today we think of numbers algebraically, and we would prove a 
statement such as “the sum of two odd numbers is an even number” as 
follows: let 2j + 1 and 2k + 1 be two odd numbers, then their sum is 
given by (2j + 1) + (2k + 1) = 2j + 2k + 2 = 2(j + k + 1), which is an 
even number. 

The Greeks, however, thought of numbers geometrically, and as we 
show in Figure 2.9, even numbers are represented by two rows of stones 
where the length of the two rows are the same, and odd numbers have 
one row with a single extra stone. 

(a) Give a diknume proof that “the sum of two odd numbers is an 
even number.” Which proof do you like better, the modern 
algebraic proof given above, or the diknume proof? 

(b) Give a diknume proof of Proposition 22 from Book IX of Euclid's 
Elements: 

Proposition 22. If as many odd numbers as we please be added 
together, and their multitude be even, the whole will be even. 


2.15 Fill in the details of the following proof that the sum of the reciprocals 
of the triangular numbers is 2, that is, 

$$\frac{1}{1} + \frac{1}{3} + \frac{1}{6} + \frac{1}{10} + \frac{1}{15} + \cdots = 2.$$

    ooooo    ooo
    ooooo    oooo

Figure 2.9 Even and odd numbers. 


(a) On the interval $[1, \infty)$, define the functions 

$$f(x) = \frac{2}{x}, \qquad g(x) = \frac{2}{x+1}.$$

Draw a careful graph of these two functions and note that they 
are both hyperbolas that approach 0 as x gets large. 

(b) For each positive integer n, verify that $f(n) - g(n) = \frac{1}{t_n}$. 

(c) Now, either give an algebraic argument or use your graph to give 
a diknume style geometric argument to show that 


$$\frac{1}{1} + \frac{1}{3} + \frac{1}{6} + \frac{1}{10} + \frac{1}{15} + \cdots = 2.$$


2.16 ★ Give an alternative proof for Theorem 2.1 by assuming there is a largest 
prime p and considering the number N = p! + 1. (The expression k! is 
read “k factorial” and represents $1 \cdot 2 \cdot 3 \cdots (k - 1) \cdot k$. Thus, 
$4! = 1 \cdot 2 \cdot 3 \cdot 4 = 24$.) 


2.17 ★ (H,S) This problem asks you to discover for yourself the famous proof 

that G. H. Hardy said had not “a wrinkle” after two thousand years! 

Either $\sqrt{2}$ is rational or not. One of these two alternatives must be 
true. Prove that $\sqrt{2}$ is irrational by eliminating the other 
alternative; that is, assume that $\sqrt{2}$ is rational, and then show that 
this alternative is impossible by reaching a contradiction, or point of 
absurdity. 

2.18 (S) If you compute log 2 on a calculator (here, log means the logarithm to 
the base 10) you will get something like log 2 ≈ 0.30103. Prove that 
log 2 is an irrational number. 

2.19 (H) Let N be the norm function defined on the complex number system 
discussed in this chapter. Prove that if $\alpha = a + b\sqrt{-6}$ and $\beta = c + d\sqrt{-6}$ 
are two numbers in this number system, then 

$$N(\alpha\beta) = N(\alpha)N(\beta).$$

2.20 Find another example of a number with two different prime 
factorizations by considering the number system consisting of all 
complex numbers of the form $a + b\sqrt{-5}$ where a and b are integers. 
(You need not bother to verify that the individual factors are prime.) 

2.21 (S) We saw in the text that if we wanted to have a thirteenth note that was 
an F a perfect octave above the original F, then we had a slight problem 
with our Pythagorean tuning. One way to deal with this problem is to 
tune partway up the scale from the original F, and then tune down the 
rest of the scale from this high F. (This doesn’t avoid the problem, of 
course; it just moves it somewhere else.) 

So, for example, the pitch ratio of that high F should be $\frac{2}{1}$. Therefore, 
if we tune down from this note to $n_{12}$, we need to divide by $\frac{3}{2}$, so the 
pitch ratio for $n_{12}$ becomes $\frac{4}{3}$. Next, we could tune $n_{11}$, and so on. Of 
course, when tuning down, whenever a note falls below $n_1$, you need to 
bring it back up an octave. 

Use this minor variation in Pythagorean tuning to finish filling in 
the pitch ratios for the twelve notes of the scale by working up the scale 
for $n_1, \ldots, n_9$, and then down the scale for $n_{12}$, $n_{11}$, $n_{10}$. 


    n1    n8    n3    n10   n5    n12   n7    n2    n9    n4      n11   n6
    1/1         9/8               4/3         3/2         27/16



Now, assuming a standard modern tuning of A = 440 Hz, give a 
Pythagorean tuning correct to two decimals for a full octave beginning 
with C below A. In other words, fill in the following table (and 
compute F, Bb, and Eb down from “high C,” while still tuning the rest of 
the notes up from the first note — which, in this case, is C): 


    C        C#   D    Eb   E    F        F#   G        G#   A     Bb   B    C
    260.74                       347.65        391.11        440





2.22 (H) In 2004 the mathematical puzzle KenKen was invented by the 

Japanese math teacher Tetsuya Miyamoto as a tool to improve the skills 
of his students. This sudoku-like puzzle quickly gained popularity and 
now appears daily in newspapers throughout the world. As in sudoku, 
the idea is to fill an n × n grid with the numbers 1 to n so as not to repeat 
a number in any row or column. But, quite unlike sudoku, KenKen 
depends heavily upon several important ideas from number theory 
such as prime factorization and the notion of how many different ways 
a given integer can be partitioned into a sum of smaller integers. 

In this problem we will see that the ancient Greek concept of 
triangular numbers can often be used in solving KenKen puzzles. 
This is due to the fact that in any solution to a KenKen puzzle the sum 
of the numbers in any row or column of the grid is $1 + 2 + \cdots + n$, that 
is, the nth triangular number. 

[Figure 2.10: a 5 × 5 KenKen puzzle] 

(a) Figure 2.10 is a typical KenKen puzzle where each heavily 
outlined “cage” indicates what must happen inside that 
particular cage. For example, inside the cage labeled 24× the 
product of the three numbers must be 24 (hence, in this 5 × 5 
grid, the three numbers are forced to be 4, 3, and 2); or in the 
cages labeled 1− the difference between the two numbers must 
be 1 (the two numbers could be 1 and 2, or 2 and 3, or 3 and 4, or 
4 and 5). In this puzzle, of course, we know that the sum of the 
numbers in any row or column must be 15 (that is, 
1 + 2 + 3 + 4 + 5). So, in the bottom row since the sum is 10 in 
the first cage, the sum in the second cage labeled 1− must be 5. 
Hence, that cage contains 2 and 3 in some order, but there is 
already a 3 in the 4th column so we can immediately place the 2 
and 3 in their correct positions in the bottom row. At this point 
we use a similar argument for the left-hand column to see that 
the sum of the two numbers in the upper and lower left-hand 
corners must be 5. Thus, these numbers are either 2 and 3, or 1 
and 4. But they can’t be 2 and 3 because there are already both a 
2 and a 3 elsewhere in the bottom row; therefore, we know these 
two numbers must be 1 and 4. Finish solving this puzzle. 

[Figure 2.11: a 4 × 4 KenKen puzzle with two cells marked a and b] 

(b) Figure 2.11 is a 4 × 4 KenKen puzzle that normally would be 
solved using trial and error. For example, it is clear that the 3− 
cage contains a 1 and 4 and hence the cage below this contains a 
2 and 3, but it seems that a good deal of experimenting needs to 
take place to determine the exact placement of these four 
numbers. On the other hand, we can use the notion of triangular 
numbers to completely avoid this tedious trial-and-error process. 

It is clear that the two numbers a and b in the 2÷ cage must be 
1 and 2 in some order. Use the fact that the sum of the numbers 
in the two bottom rows is 20 (an even number) to determine 
whether b is odd or even and hence whether b = 1 or b = 2. Then 
finish solving the puzzle without any trial and error. 

(c) Figure 2.12 is a fairly difficult 6 × 6 KenKen puzzle that can be 
solved with the aid of triangular numbers. We will also use this 
particular puzzle to illustrate another very useful and surprising 
puzzle solving technique. 

As always, we wish to avoid resorting to trial and error. First, 
we focus on the 4× cage in the bottom two rows. The question is 
whether the 4 factors as 4 · 1 · 1 or as 2 · 2 · 1. In either case, since 
the 120 in the 120× cage in these two rows must factor as 6 · 5 · 4, 
we can conclude that the 2÷ cage in the bottom row does not 
contain 2 and 4. Hence this 2÷ cage contains either 1 and 2, or 3 
and 6, and either way the sum of the two numbers in this cage is 
odd. Use this information and the fact that the 6th triangular 
number is 21 to determine the values of a, b, and c. 

Now, quite independently of this argument using triangular 
numbers we can also determine the value of d using another 
clever, though perhaps not entirely fair, argument involving the 
uniqueness of the solution to this problem. It is usually implicit, 
though rarely stated, in puzzles such as KenKen and sudoku that 
there is a unique solution to the puzzle. This fact can sometimes 
be invoked to aid in finding the solution. Consider the situation 
in the 120× cage in the bottom two rows in this puzzle. Look at 
the 11+ cage in the top row in the 4th and 5th columns. Since 
this cage must contain 5 and 6, if we assume that there is a 
unique solution to this puzzle, then there cannot also be a 5 and 
6 in the two squares in the bottom row in the 120× cage. 
(Otherwise, simply switching the 5s and 6s in the top and 
bottom rows yields two distinct solutions.) Under this 
“uniqueness” assumption, determine the value of d. Then, 
without any use of trial and error, finish solving this puzzle. 

[Figure 2.12: a 6 × 6 KenKen puzzle with four cells marked a, b, c, and d] 

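2.23 An expression such as 

$$\sqrt{2 + \sqrt{2 + \sqrt{2 + \sqrt{2 + \cdots}}}}$$

is called an infinite nested radical. It is not hard to guess the value of this 
particular expression because we see that the sequence of radicals 

$$s_1 = \sqrt{2} \approx 1.41421,$$

$$s_2 = \sqrt{2 + \sqrt{2}} \approx 1.84776,$$

$$s_3 = \sqrt{2 + \sqrt{2 + \sqrt{2}}} \approx 1.96157,$$

$$s_4 = \sqrt{2 + \sqrt{2 + \sqrt{2 + \sqrt{2}}}} \approx 1.99037,$$

$$s_5 = \sqrt{2 + \sqrt{2 + \sqrt{2 + \sqrt{2 + \sqrt{2}}}}} \approx 1.99759,$$
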


seems to be converging to the number 2. Let’s confirm this. 
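
The decimal values above are easy to reproduce. The sketch below is only a numerical illustration and proves nothing about convergence; the function name nested_radicals and its parameter c (the number under each radical, 2 in this case) are assumptions of this example.

```python
from math import sqrt


def nested_radicals(c, count):
    """First `count` terms of s_1 = sqrt(c), s_k = sqrt(c + s_(k-1))."""
    terms = [sqrt(c)]
    while len(terms) < count:
        terms.append(sqrt(c + terms[-1]))
    return terms


for k, s in enumerate(nested_radicals(2, 5), start=1):
    print(k, round(s, 5))
```

Running it with c = 2 gives the five values listed above, each a little closer to 2 than the one before.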

It is clear that the sequence $s_1, s_2, s_3, \ldots$ is an increasing sequence 
of positive numbers simply because $2 + \sqrt{2} > 2$. Therefore, either this 
sequence converges to a real number or it diverges to infinity. 

(a) Prove that this sequence converges (that is, it does not diverge 
to infinity) by using induction to prove that $s_k < 2$, for each 
$k \ge 1$. (Note that $s_k = \sqrt{2 + s_{k-1}}$ for each $k > 1$.) 

(b) Let m be the real number to which this sequence converges, that 
is, 

$$m = \sqrt{2 + \sqrt{2 + \sqrt{2 + \sqrt{2 + \cdots}}}}.$$

Note first that $m = \sqrt{2 + m}$, and then prove that m = 2 by solving 
an appropriate quadratic equation. 
an appropriate quadratic equation. 

(c) Show that the infinite nested radical 




1 + Vl + ■ • • 
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also converges to a real number. Then find that number by 
solving an appropriate quadratic equation. This famous 
irrational number is called the golden ratio. 

(d) Now, let n > 2 be an integer. Consider the infinite nested radical 



Prove that this sequence converges — that is, it does not 
diverge to infinity — by using induction to prove that s* < V2 n, 
for each k > 1. (Note that since n > 2, 2n < n 2 .) 

(e) Let m be the real number to which this sequence converges, that
is,

m = \sqrt{n + \sqrt{n + \sqrt{n + \sqrt{n + \cdots}}}}.

Prove that m is an integer if and only if n is equal to twice a
triangular number.

For example, if n = 2t_3 = 2 • 6 = 12, it turns out that

\sqrt{12 + \sqrt{12 + \sqrt{12 + \sqrt{12 + \cdots}}}} = 4.
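
As a numerical sanity check on the nested radicals in this problem (not a substitute for the proofs requested), here is a minimal Python sketch that iterates s_k = \sqrt{n + s_{k-1}} starting from s_1 = \sqrt{n}. The function name nested_radical and the choice of 60 iterations are illustrative assumptions, not part of the problem.

```python
import math

def nested_radical(n, steps=60):
    """Approximate sqrt(n + sqrt(n + ...)) by iterating s_k = sqrt(n + s_{k-1})."""
    s = math.sqrt(n)              # s_1
    for _ in range(steps - 1):
        s = math.sqrt(n + s)      # s_k computed from s_{k-1}
    return s

# n = 2 and n = 12 are the cases discussed in the problem; n = 1 is part (c).
for n in (1, 2, 12):
    print(n, nested_radical(n))
```

The printed values only suggest the limits; the induction and the quadratic equation are still needed to prove them.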


Divisibility 


In the first two chapters we saw that, although modern number theory 
as we know it may have begun with the work of Fermat, its true source 
lay in the mathematics of the ancient Greeks. Now, in this chapter we 
are going to step briefly out of the historical flow of the story of number 
theory in order to discuss division. We want to do this carefully, and 
from a modern point of view, before returning to the Greeks and to 
Fermat in Chapters 4 and 5. We have two primary objectives in this 
chapter. 

One very important goal is to prove that the integers do have unique 
factorization— that is, every integer greater than 1 can be written in just 
one way as a product of one or more prime numbers. We saw in the 
last chapter that there are in fact number systems in which numbers 
can be written in more than one way as a product of primes, so we are 
going to have to be very careful indeed to prove that the integers have 
the property of unique factorization! 

The other main objective in this chapter is the introduction of the 
notion of congruences — a topic and notation related to division that 
would have been extremely useful to Fermat. This is a very natural 
concept that was developed as a formal way to deal systematically 
with remainders. Congruences are an excellent example of how much 
simpler a mathematical concept can be once there is a good notation 
and good language for us to use. 


The Euclidean Algorithm 

It turns out that the proof of unique factorization for the integers 
depends on a fundamental process known as the Euclidean algorithm, 
which first appeared, as you might guess, in Euclid’s Elements. This algo- 
rithm produces in a step-by-step fashion the greatest common divisor — 
that is, the highest common factor — of two numbers. Let's see how the 
Euclidean algorithm works by looking at a simple example. 

Begin with two numbers 30 and 72. Divide the smaller number into 
the larger one and get a quotient 2 and a remainder 12, and write 
72 = 2 • 30 + 12. Now, since 12 < 30 (that is why 12 was a remainder,
after all), we can repeat this process and now divide 12 into 30 to get
another quotient and remainder, and write 30 = 2 • 12 + 6. Repeat once
more, dividing 6 into 12, and write 12 = 2 • 6 + 0. At this point the
new remainder is 0 and the process stops, and we conclude that 6 is the 
greatest common divisor of 30 and 72. Here are the three steps: 


72 = 2 • 30 + 12, 


30 = 2 • 12 + 6,


12 = 2 • 6 + 0.


How do we know that the algorithm has produced the greatest 
common divisor? Well, we can tell that 6 is a common divisor of these 
two numbers 30 and 72, because the last equation shows us that 6 
divides 12, and so, the previous equation then shows us that 6 divides 
30 (since 6 divides both 6 and 12), and finally the first equation then 
shows us that 6 divides 72 (since we now know that 6 divides both 12 
and 30). 

But we still need to conclude that 6 is the greatest common divisor.
Perhaps some larger number divides both 30 and 72. To rule this out,
we will first work our way backwards through the steps above to express
6 as a linear combination of 30 and 72.

So, begin with the next to last equation above, and write 

6 = 30 - 2 • 12.

Then, working backwards, you can write the first equation as 12 = 
72 - 2 • 30. This expression can now be substituted for 12 in the previous 
equation to get 


6 = 30 - 2(72 - 2 • 30) = 5 • 30 - 2 • 72. 


At this point, notice that 6 has been written entirely in terms of 30 and 
72, that is, as a linear combination: 


6 = 5 • 30 - 2 • 72. 


Therefore — and this is the key idea — any number c that divides both 30 
and 72 must also divide 6, which means that 6 is the greatest common 
divisor, as claimed. 

It should be clear just from this simple example that the Euclidean
algorithm will always produce the greatest common divisor of any
two numbers. Each step in the algorithm produces a quotient and a
remainder. Because the remainders are nonnegative integers that get
strictly smaller at each step, eventually you must get a zero remainder.
The last nonzero remainder is then the greatest common divisor.
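
The repeated division described above can be sketched in a few lines of Python. This is only an illustration: the function name euclid_steps and the tuple layout (dividend, quotient, divisor, remainder) are our own choices, and the inputs are assumed to be natural numbers.

```python
def euclid_steps(a, b):
    """Run the Euclidean algorithm on natural numbers a and b.

    Returns the list of division steps (dividend, quotient, divisor, remainder)
    together with the last nonzero remainder, which is the gcd.
    """
    steps = []
    while b != 0:
        q, r = divmod(a, b)       # a = q*b + r with 0 <= r < b
        steps.append((a, q, b, r))
        a, b = b, r               # the divisor and remainder become the new pair
    return steps, a

steps, g = euclid_steps(72, 30)
for dividend, q, divisor, r in steps:
    print(f"{dividend} = {q} * {divisor} + {r}")
print("gcd =", g)
```

Run on 72 and 30, the sketch reproduces the three equations of the example and ends with the greatest common divisor 6.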


The Greatest Common Divisor 

It will turn out that the real importance of the Euclidean algorithm is 
not so much that it can be used to find the greatest common divisor 
of two numbers, but, as we saw in the example above, it can be used 
to express that greatest common divisor as a linear combination of 
the two numbers. In other words, if a and b are two natural numbers, 
then the Euclidean algorithm will first produce the greatest common 
divisor, which we write as gcd(a, b), and then, by reversing the steps in
the algorithm, the greatest common divisor can be expressed as a linear
combination of the two numbers; that is, it can be written as

gcd(a, b) = xa + yb.

Our use of the Euclidean algorithm will often be in the case where 
the two numbers in question are relatively prime — that is, when their 
greatest common divisor is 1— and so, for just a bit more practice, 
we will now use the Euclidean algorithm not only to show that the 
two numbers 2001 and 1984 are relatively prime, but to express their 
greatest common divisor in the form 1 = x • 2001 + y • 1984.

The steps in the Euclidean algorithm produce the following sequence 
of equations: 


2001 = 1 • 1984 + 17,

1984 = 116 • 17 + 12,

17 = 1 • 12 + 5,

12 = 2 • 5 + 2,

5 = 2 • 2 + 1,

2 = 2 • 1 + 0.

We conclude that gcd(2001, 1984) = 1, since 1 is the last nonzero
remainder. 

Now, in order to write 1 as a linear combination of 2001 and 1984, we 
begin with the next to last equation, and write 

1 = 5 - 2 • 2.

Then, working backwards, we can write the next equation as
2 = 12 - 2 • 5. This expression can now be substituted for 2 in the previous
equation to get


1 = 5 - 2 • (12 - 2 • 5) = 5 • 5 - 2 • 12.

Continuing backwards, we write 5 = 17 - 1 • 12, and substitute to get

1 = 5(17 - 1 • 12) - 2 • 12 = 5 • 17 - 7 • 12. 

Next, write 12 = 1984 - 116 • 17, and substitute to get 

1 = 5 • 17 - 7(1984 - 116 • 17) = 817 • 17 - 7 • 1984. 

Finally, write 17 = 2001 - 1 • 1984, and substitute to get 

1 = 817(2001 - 1 • 1984) - 7 • 1984 = 817 • 2001 - 824 • 1984. 

At this point, notice that, as desired, 1 has been written as a linear 
combination of 2001 and 1984, that is, 

1 = 817 • 2001 + (-824) • 1984.
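
The back-substitution above can also be organized as a single forward pass. The Python sketch below is one common way to do this, not the procedure used in the text: instead of working backwards, it carries along coefficients x and y for each remainder as the divisions proceed. The name extended_gcd is our own.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and g = x*a + y*b.

    Invariant: at every step, old_r = old_x*a + old_y*b and r = x*a + y*b.
    """
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

g, x, y = extended_gcd(2001, 1984)
print(g, x, y)
print(x * 2001 + y * 1984 == g)
```

For 2001 and 1984 this bookkeeping arrives at the same coefficients, 817 and -824, that the back-substitution produced.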

The fact that the Euclidean algorithm allows us always to write the 
greatest common divisor of two numbers a and b as a linear combi- 
nation of the two numbers leads us to a way of thinking about the 
greatest common divisor that is very important. If we take all possible 
linear combinations xa + yb of a and b, then obviously some of these 
combinations will be positive, some will be negative, and some will be 
zero. Let’s focus on those that are positive. 

So, for two numbers a and b, not both zero, let 

S = {xa + yb | xa + yb > 0};

that is, S is the set of all the linear combinations of a and b that are
positive. Thus, S is a nonempty set of natural numbers. Therefore, S has
a least element. (This is the well-ordering principle mentioned in the
last chapter.) Guess what that least element is? Right, gcd(a, b).

When a = 2001 and b = 1984, since we just showed that 1 is a linear
combination of 2001 and 1984, we know that 1 ∈ S, and so 1 is clearly
the least element in S.

But what about when a = 30 and b = 72? We know that 6 ∈ S, since
we wrote 6 = 5 • 30 + (-2) • 72. But is it possible that some smaller positive
integer can also be written as a linear combination of 30 and 72? There
is still some work to do to prove this, but the answer is no: gcd(a, b)
is always the least element in S. You will be asked to prove this in
Problem 3.7.
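
Before attempting the proof, it can be reassuring to test the claim on a few small cases. The Python sketch below is only an experiment: it searches coefficients x and y in a limited window (the bound of 10 is an arbitrary assumption, far too small for a pair like 2001 and 1984) and compares the least positive combination it finds with the greatest common divisor. The name least_positive_combination is our own.

```python
from math import gcd

def least_positive_combination(a, b, bound=10):
    """Smallest positive value of x*a + y*b with |x| <= bound and |y| <= bound."""
    values = [x * a + y * b
              for x in range(-bound, bound + 1)
              for y in range(-bound, bound + 1)
              if x * a + y * b > 0]
    return min(values)

for a, b in [(30, 72), (12, 18), (7, 5)]:
    print(a, b, least_positive_combination(a, b), gcd(a, b))
```

A search like this can only suggest the pattern; Problem 3.7 asks for the argument that covers all coefficients.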


The Division Algorithm 

The key step that we repeat over and over again in the Euclidean 
algorithm, dividing a smaller number into a larger number to get a 
quotient and a remainder, is itself fundamental enough to warrant a 
name of its own, the division algorithm:

Given any two integers a and b with b > 0, there exist unique
integers q and r such that a = qb + r with 0 ≤ r < b.

We think of dividing a number b into a number a as many times as 
we can until the remainder r is smaller than b; Euclid, in the Elements, 
thought of marking off identical line segments of length b along a line 
segment a until a segment was left whose length r was less than b. Either 
way you think about it, it should be obvious that a unique quotient q, 
and unique remainder r are the result. 

Note that care has been taken to express the division algorithm in 
such a way that negative values of a are possible. So, not only does the 
division algorithm apply in cases that Euclid might have imagined such 
as a = 26, b = 7, where we get


26 = 3 • 7 + 5,


but it also applies to cases such as a = -17, b = 5, where we get

-17 = (-4) • 5 + 3.

Following Euclid, you should pause at this point to visualize both 
of these examples on a number line. Think about where 26 falls on 
the number line between multiples of 7 to “see” why the remainder 
is 5. Similarly, think about where -17 falls on the number line between 
multiples of 5 to see why the remainder in that case is 3. 
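
Both examples can be checked with Python's built-in divmod, which for a positive divisor b returns a quotient and remainder satisfying a = qb + r with 0 ≤ r < b. The wrapper name division_algorithm is our own, and the check on b simply mirrors the hypothesis b > 0.

```python
def division_algorithm(a, b):
    """Return (q, r) with a = q*b + r and 0 <= r < b, for an integer a and b > 0."""
    if b <= 0:
        raise ValueError("the division algorithm is stated for b > 0")
    q, r = divmod(a, b)           # floor division, so r is never negative here
    return q, r

for a, b in [(26, 7), (-17, 5)]:
    q, r = division_algorithm(a, b)
    print(f"{a} = ({q}) * {b} + {r}")
```

Note that many other programming languages round the quotient of a negative number toward zero, which would give -17 = (-3) • 5 + (-2) and a negative remainder, so the built-in operator does not always match the theorem.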

The division algorithm is sufficiently important that we are going to
take the trouble to give a formal proof of it. However, since there can be 
no doubt whatsoever as to its truth, you would be justified in suspecting 
that our real motive in doing so at this point is to gain practice in the art 
of proving things. 

We begin with the well-ordering principle, which we now record as 
an axiom. 

Axiom 3.1 (well-ordering principle). Every nonempty set of natural num- 
bers contains a least element. 

We now immediately use this axiom to prove the division algorithm. 
You may notice in the proof that we apply the well-ordering principle 
not to a set of natural numbers, but to a set of nonnegative numbers. 
However, it is clear that if every nonempty set of natural numbers 
contains a least element, then every nonempty set of nonnegative 
numbers contains a least element. 

Theorem 3.1 (division algorithm). Given any two integers a and b with
b > 0, there exist unique integers q and r such that a = qb + r with 0 ≤ r < b.

Proof 

We intend to use the well-ordering principle. The idea is to design a
set of nonnegative integers whose least element will be the remainder r we
seek. We begin with the two integers a and b where b > 0, but we have 
no idea whether a is positive, or negative, or zero. So, starting at a, we 
keep adding and subtracting b to construct the doubly infinite set 

S = {. . . , a - 2b, a - b, a, a + b, a + 2b, . . . }. 

Next, we simply remove any negative numbers from this set S. This 
leaves us with the nonnegative numbers in the set S: 

T = {a - kb ∈ S | a - kb ≥ 0}.

(Note that the k values might be any integers here, positive, negative, or
zero, as long as a - kb ≥ 0.)

Now, T is a nonempty set. This is of course obvious if a ≥ 0, since
then a = a - 0 • b ∈ T. On the other hand, if a is negative, then all we
have to do is take a sufficiently large negative value of k to make a - kb
positive, so that a - kb will be an element of T (and, in fact, k = a will
work, since a - ab = (-a)(b - 1) ≥ 0).

Hence, by the well-ordering principle, T has a least element r, and
we can write r = a - qb. Since r ∈ T, r ≥ 0. Furthermore, since r - b ∉ T
(this is because r is the least element in T), we conclude that r - b < 0,
and so r < b. Thus 0 ≤ r < b, as desired.

We still need to show that q and r are unique. So assume that p
and s are integers such that a = pb + s with 0 ≤ s < b. We may as well
assume that r ≤ s. Since qb + r = pb + s, we can write (q - p)b = s - r;
but 0 ≤ s - r < b, and the only multiple of b in that range is 0, so
(q - p)b must actually be 0; hence q = p and s = r,
which proves uniqueness. This completes the proof. ■

By now it should be fairly clear to us that divisibility plays a central 
role in number theory. So let us pause here to be absolutely precise about 
what it means for us to say one number “divides” another number. This 
may not be something you have been in doubt about, but remember, we 
are being very careful in this chapter. 


Divisibility 

We begin at the beginning by defining what it means for one number to 
divide another number. A nonzero integer a is said to divide an integer b 
if there is an integer c such that b = ca. In this case, we write a | b, which
we read as “a divides b”; we can also express the same thing by saying
that a is a divisor of b, or that b is a multiple of a. In the event that a is not
a divisor of b, we write a ∤ b.

So, for example, 17 divides 102 because there is an integer 6 such that
102 = 6 • 17, but 100 is not a multiple of 17 because there is no integer c
such that 100 = c • 17. Similarly we can conclude that 14 | (-42), because
-42 = (-3) • 14, but that -14 ∤ 1010.

By the division algorithm, Theorem 3.1, we know that for integers a
and b, when a > 0, we can write b = qa + r with 0 ≤ r < a. So the
statement that a divides b is equivalent to saying that the remainder
r = 0.
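
The definition translates directly into a remainder test. In the Python sketch below, the function name divides is our own, and the error for a = 0 reflects the requirement in the definition that a be nonzero.

```python
def divides(a, b):
    """Return True if the nonzero integer a divides the integer b."""
    if a == 0:
        raise ValueError("the definition requires a to be nonzero")
    return b % a == 0             # remainder 0 means b = c*a for some integer c

print(divides(17, 102), divides(17, 100))
print(divides(14, -42), divides(-14, 1010))
```

The four calls correspond to the four examples in the text.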

Several important divisibility properties are worth pointing out. 
A few of these properties are fairly obvious. For example, for any 
nonzero integer a, a | a and a | (-a); and for any integer a, 1 | a and -1 | a;
in each of these cases you should be sure you know exactly what value
of c in the definition of divisibility makes the property in question true.

Similarly, for any nonzero integer a, a | 0. (Why?) Note, on the other
hand, that the definition of divisibility has been very carefully stated to 
avoid “dividing by 0.” 

Other properties of divisibility are less obvious, and require more 
formal proof. Many of these we often use without even thinking, for 
example: if a | b, and x is any integer, then a | bx. Here is a list of the
most important divisibility properties. You will be asked to give rigorous
proofs for several of these properties in the problem section at the end 
of the chapter. 

The Basic Divisibility Properties 

1. a | a and a | (-a), for all a ≠ 0;

2. 1 | a and -1 | a, for all a;

3. a | 0, for all a ≠ 0;

4. if a | b, then a | bx, for any x;

5. if a | b and a | c, then a | bx + cy, for any x, y;

6. if a | b and b | c, then a | c;

7. if a | b and b | a, then a = ±b;

8. if a > 0 and b > 0, and if a | b, then a ≤ b.

All of these properties can be proved easily from the definition of 
divisibility. Here is a proof of property (4): since a | b, there is an integer
c such that b = ca, but then we can write

bx = (ca)x = (cx)a,

and therefore, a | bx. Notice how the last step in the expression above
exhibits the number cx playing the same role that c plays in the definition
of divisibility. 

Let’s do one more proof. Here is a proof of property (7). First, note 
that since the hypothesis says that a | b and b | a, we already know that
neither a nor b is 0. Now, since a | b, there is an integer c such that b = ca.
Similarly, since b | a, there is an integer d such that a = db. So, we can
write


b = ca = c(db) = cdb.

Since b ≠ 0, we can now divide the equation b = cdb by b, and get
1 = cd. But c and d are both integers, so d = ±1. Therefore, a = ±b.

Now let us turn to a crucial property of divisibility that is not at 
all obvious — that is to say, it is not obvious unless we assume unique 
factorization of the integers, which, after all, is one of our primary goals 
in this chapter. Suppose we know that 6|c for some number c, and we 
also know that 5|c. Isn’t it clear that 30|c? Well, the reason we think 
this is obvious is that, since 6|c, we know that 2 and 3 are part of the 
prime decomposition of c; and, since 5|c, 5 is also in this same prime 
decomposition, so that should mean that 2 • 3 • 5 = 30 is in the prime 
decomposition of c. However, we can conclude this only because we 
are automatically assuming there is a unique prime factorization for c.
Maybe instead there is one factorization of c that involves 2s and 3s and
possibly a few other primes but no 5s, and then a completely different 
factorization of c that has a 5 in it but is missing 2s or 3s. In that case, 
30 would not divide c. 

Still, in this example, we certainly believe that 30 must divide c. How 
can we arrive at this conclusion without assuming the very thing we 
are trying to prove? The solution to this dilemma is to be found in the 
Euclidean algorithm, which allows us to prove exactly the theorem we 
so strongly believe must be true! Recall that two integers a and b are 
relatively prime if gcd(a, b) = 1. For example, 6 and 5 are relatively prime. 


Theorem 3.2. Let a and b be two relatively prime integers. If a | c and b | c,
then ab | c.


Proof 

Since gcd(a, b) = 1, the Euclidean algorithm allows us to express this 
greatest common divisor of a and b as a linear combination of a and b. 
That is, we can write 1 = xa + yb, for some integers x and y. This turns 
out to be the key step in the proof. 

Now, since a | c, there is an integer u such that c = ua. Also, since
b | c, there is an integer v such that c = vb. Multiplying 1 = xa + yb by c,
and then substituting, we get


c = cxa + cyb = (vb)xa + (ua)yb = (vx + uy)ab,


and we conclude that ab | c, by the definition of divisibility. This
completes the proof. ■


As a simple example of how this theorem can be applied, we will show
that n^3 - n is divisible by 6 for every integer n. We do this by showing
separately that 2 | n^3 - n and that 3 | n^3 - n. Then, by Theorem 3.2, we can
conclude that 6 | n^3 - n.

Since n^3 is odd when n is odd, and even when n is even, we see that
n^3 - n is even, that is, 2 | n^3 - n. To show that n^3 - n is divisible by 3, we
write n^3 - n = n(n - 1)(n + 1), and observe that for any integer n, one of
the three consecutive integers, n - 1, n, or n + 1, must be divisible by 3.
Hence, 3 | n^3 - n. Therefore, we conclude that n^3 - n is divisible by 6.
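
A quick computation cannot replace the argument above, but it does make the claim concrete. The Python sketch below tests n^3 - n against a proposed divisor d over a finite range of n; the function name and the range of 200 are arbitrary choices of ours.

```python
def cube_minus_n_divisible_by(d, bound=200):
    """Check whether d divides n**3 - n for every integer n with |n| <= bound."""
    return all((n**3 - n) % d == 0 for n in range(-bound, bound + 1))

for d in (2, 3, 6, 4):
    print(d, cube_minus_n_divisible_by(d))
```

The divisor 4 is included for contrast: n = 2 gives n^3 - n = 6, which is not a multiple of 4.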

In Problem 3.13, you will be asked to prove similarly that n^5 - n is
always divisible by 30 for every integer n.


The Fundamental Theorem of Arithmetic

Now we are ready to turn our full attention to unique factorization. 
In Proposition 30 of Book VII of the Elements, Euclid identified the 
one property of prime numbers that is now considered to be every bit 
as important as their irreducibility. This property of prime numbers— 
the property which opens the door to unique factorization— now fre- 
quently goes by the name of Euclid’s lemma (although a slightly more 
general version of the same property is often given this same name, a 
version presented to you in Problem 3.18). 

Theorem 3.3 (Euclid’s lemma). If p is prime and p | ab, then either p | a or
p | b.

Proof 

Suppose that p ∤ a. Then, since the only positive divisors of the prime p
are 1 and p, gcd(p, a) = 1, and so, by the Euclidean algorithm, we can
write 1 = xp + ya, for some integers x and y. Multiplying
by b, we get b = xpb + yab.

But p | ab, so we can also write ab = cp. Therefore,

b = xpb + y(cp) = (xb + yc)p,


and p | b, as desired.

Note that we could have also used property (5) of the basic divisibility
properties to conclude that p | b, since p divides each of the terms
xpb and yab in the expression b = xpb + yab.

This completes the proof. ■ 

There are several comments to make here about Euclid’s lemma. 
Perhaps the first is that there is obviously nothing special about having 
only two terms a and b in the lemma; it would be equally true if there 
were three or four or any number of terms, in other words: if p is prime
and p | a_1 a_2 ··· a_n, then p | a_i for some i (see Problem 3.19).

The next comment is that Euclid really did get it right in that the 
property stated in Theorem 3.3 is a crucial one that can completely 
distinguish prime numbers from composite numbers in the integers. 
For example, 6 | 10 • 15 = 150 since 150 = 25 • 6, yet 6 ∤ 10 and 6 ∤ 15.
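
To see how sharply this property separates primes from composites, we can test it by brute force for small n. The Python sketch below is an experiment under a stated simplification: it only tries factors a and b between 1 and n - 1, which is enough because a and b can be replaced by their remainders modulo n. Both function names are our own.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n > 1 and all(n % d != 0 for d in range(2, int(n**0.5) + 1))

def has_euclid_property(n):
    """True if n | a*b never happens for 1 <= a, b < n (so n divides neither factor)."""
    for a in range(1, n):
        for b in range(1, n):
            if (a * b) % n == 0:  # n | ab although n divides neither a nor b
                return False
    return True

print([n for n in range(2, 40) if has_euclid_property(n)])
print(all(has_euclid_property(n) == is_prime(n) for n in range(2, 60)))
```

In this range the integers with the property of Theorem 3.3 are exactly the primes.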

Thus, we now have two completely different ways of looking at the 
prime numbers {2, 3, 5, 7, 11, ...}:

1. they are the integers greater than 1 that are irreducible, that is, 
they are not composite; or 

2. they are the integers greater than 1 that have the property 
described in Theorem 3.3.

More than twenty-three centuries have passed since Euclid taught us to
see prime numbers in these two distinct ways. 

In Problems 3.23 and 3.24, you will see examples of number systems 
in which these two properties are not shared in the same way as they 
are in the integers. Therefore, for such algebraic systems two different 
terms will be needed: 

1. the term irreducible element will be used for elements or numbers 
that can’t be factored except trivially; and 

2. the term prime element will be used for elements or numbers that 
satisfy the property described in Theorem 3.3. 

We are now poised to prove unique factorization for the integers, 
and we shall soon see that the proof will depend entirely on Euclid’s 
lemma. So, it is worth wondering why this fundamentally important 
result did not appear in Euclid’s Elements. How could Euclid have 
missed it? 

In some sense, he had to have known this result, yet in another 
sense, he couldn’t have. These days, we have a notation that didn’t even 
exist in Euclid’s time that allows us to think effortlessly of products of 
any length, and with any exponents whatsoever. We can write 60 =
2^2 • 3 • 5, or n = p_1 p_2 ··· p_k, but Euclid did not have that luxury!
Notation in mathematics matters, and greatly affects the way we think 
about things. This is very similar to the way in which the particular 
language you speak shapes your thoughts. Unique factorization, which 
is now such a central and important concept in modern number theory, 
simply did not exist as part of the mathematical language in Euclid’s 
day. 

Before we prove unique factorization for the integers, let's be clear 
about what it is that we are proving. When we say there is only one way 
to factor 60 into primes, we need to be careful that we don’t allow, for 
example, 60 = 2 • 2 • 3 • 5 and 60 = 3 • 2 • 5 • 2 to count as two different 
factorizations, or representations, of 60. That’s why, in the following 
theorem, we state explicitly that the order of the factors doesn’t matter 
to us; it is only the factors themselves that count. 

Theorem 3.4 (fundamental theorem of arithmetic). Every integer 
greater than 1 can be represented as a product of one or more primes, and,
except for the order of the primes, this representation is unique. 

Proof 

Let n be an integer greater than 1. By Theorem 2.2, we already know
that n can be written as a product of one or more primes. All that is left
to show is that this representation is unique. Suppose that n has two
representations as products of primes:

n = p_1 • p_2 ··· p_j and n = q_1 • q_2 ··· q_k.

We need to show that these two representations are, in fact, the same. 

First, we see that p_1 | n, so by Euclid’s lemma (see Problem 3.19),
p_1 | q_i for some i. We might as well assume that p_1 | q_1. But q_1 is prime,
so p_1 = q_1. Thus,


p_2 ··· p_j = q_2 ··· q_k.

Now, we see that p_2 | q_2 ··· q_k, so by Euclid’s lemma, p_2 | q_l, for
some prime q_l, where 2 ≤ l ≤ k. We might as well assume that p_2 | q_2.
But q_2 is prime, so p_2 = q_2. Thus,

p_3 ··· p_j = q_3 ··· q_k.

We can of course continue in this fashion, and show that p_3 = q_m
for some prime q_m, where 3 ≤ m ≤ k, and we might as well assume that
p_3 = q_3. Thus, prime by prime, we can show that each prime p on the
left equals a prime q on the right. Moreover, when we get to the last 
prime pj on the left, we will also necessarily get to the last prime on 
the right, since if at any stage you run out of primes on only one side, 
then you also have an expression for 1 as a product of primes, which is 
impossible. 

Therefore, these two representations of n are the same after all. 
This completes the proof. ■ 


As simple and straightforward as Theorem 3.4 is — after all, its proof 
depends on ideas that were available to the ancient Greeks — the funda- 
mental theorem of arithmetic is a theorem that belongs to the modern 
age. It was first proven by Gauss in 1801, when he was only twenty- 
four years old. Because it was Gauss who gave the first statement and 
proof of Theorem 3.4, algebraic systems that share with the integers 
the property of having unique factorization are called either Gaussian 
domains, or sometimes unique factorization domains. 

Before we turn to our second main objective in this chapter, consider 
a final comment about notation. While Theorem 3.4 tells us there is 
only one way to factor a number such as 107 207 100, we still have
many options as to exactly how we choose to write that factorization.
Here are just a few of those options: 

2 • 2 • 3 • 3 • 5 • 5 • 7 • 7 • 11 • 13 • 17, 

7 • 2 • 3 • 17 • 5 • 3 • 11 • 7 • 5 • 13 • 2,

17 • 13 • 11 • 7 • 7 • 5 • 5 • 3 • 3 • 2 • 2,

2^2 (3 • 3 • 5 • 5 • 7 • 7 • 11 • 13 • 17),

(2 • 3 • 5 • 7)^2 (11 • 13 • 17),

(2 • 3 • 5 • 7) (2 • 3 • 5 • 7) (11 • 13 • 17).

Obviously, we could go on and on, arranging and rearranging these 
same numbers. 

However, as you can imagine, it would be useful to have a standard 
way to write prime decompositions. There are some natural choices. In 
some cases, the best choice is either ascending or descending order, that 
is, either 2 • 2 • 3 • 3 • 5 • 5 • 7 • 7 • 11 • 13 • 17 or 17 • 13 • 11 • 7 • 7 • 5 • 5 • 3 • 3 • 2 • 2.

The other most common standard way to write a prime decomposi- 
tion is to group individual primes together and use exponents. This is 
called the canonical form. Also, in this form, the primes are arranged in 
ascending order. (This is, of course, arbitrary but is standard practice — 
which is part of the reason it’s called the canonical form.) So, for 
example, the canonical prime factorization for 107 207 100 would be 

107 207 100 = 2^2 • 3^2 • 5^2 • 7^2 • 11 • 13 • 17.

In general, then, the canonical form for an integer n > 1 is

n = p_1^{e_1} p_2^{e_2} ··· p_k^{e_k},

where the p_i are all distinct primes written in ascending order, and all
exponents are assumed to be positive. 
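
For numbers of modest size, the canonical form can be found by simple trial division. The Python sketch below is one straightforward way to do it, with no claim to efficiency; the function name canonical_form and the representation as a list of (prime, exponent) pairs are our own choices.

```python
def canonical_form(n):
    """Return the canonical prime factorization of n > 1 as [(p_1, e_1), ..., (p_k, e_k)],
    with the primes in ascending order and every exponent positive."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:     # divide out every copy of p
                n //= p
                e += 1
            factors.append((p, e))
        p += 1
    if n > 1:                     # whatever remains is a single prime
        factors.append((n, 1))
    return factors

print(canonical_form(60))
print(canonical_form(107207100))
```

Because each prime is divided out completely before the trial divisor increases, only primes ever pass the divisibility test, and they are recorded in ascending order automatically.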


Congruences 

In the same year that he proved the fundamental theorem of arith- 
metic, Gauss also introduced the theory of congruences along with the 
terminology and notation we still use today. The idea behind the notion 
of congruences is based on the rather obvious observation that many
of the arguments used by Fermat ultimately boil down to very simple
facts about remainders. The proof that no prime of the form 4n + 3 can
ever be the sum of two squares, because squares must have either the
form 4n or 4n + 1, is a good example of this phenomenon. The general
idea behind congruences, then, is to forget about the 4n part and do 
arithmetic using just the remainders. Gauss showed us how this can be 
made to work in a precise way. 

So, if in arguments such as these it is the remainders that matter 
rather than the numbers themselves, we should not distinguish be- 
tween 7 and 11 because each has the same remainder when divided by 
4, or between 0 and 28 for the same reason. This leads to the following 
formal definition. 

For a positive integer m, we say that a is congruent to b modulo m, and
we write a ≡ b (mod m), if m divides a - b; otherwise, a is not congruent
to b modulo m, and we write a ≢ b (mod m).

So, we have 11 ≡ 7 (mod 4), 28 ≡ 0 (mod 4), and 11 ≢ 7 (mod 3).
The number m in this definition is called the modulus. This terminology 
is due to Gauss. Others had called this number, more simply, the divisor, 
but at this late date we are now stuck with modulus. Note that the 
definition does not explicitly say that a is congruent to b if a and b have 
the same remainders, but of course the net effect is the same. 
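
The definition, and its agreement with the "same remainder" point of view, can be checked directly. In the Python sketch below the function name is_congruent is our own; it tests exactly the condition in the definition, that m divides a - b.

```python
def is_congruent(a, b, m):
    """Return True if a ≡ b (mod m), that is, if m divides a - b (m a positive integer)."""
    if m <= 0:
        raise ValueError("the modulus must be a positive integer")
    return (a - b) % m == 0

print(is_congruent(11, 7, 4), is_congruent(28, 0, 4), is_congruent(11, 7, 3))

# Compare with the "same remainder" description for a range of small integers.
print(all(is_congruent(a, b, 4) == (a % 4 == b % 4)
          for a in range(-20, 21) for b in range(-20, 21)))
```

The last line compares the two descriptions for the modulus 4 over a small range; Python's % operator returns a remainder between 0 and m - 1 for a positive modulus, so it matches the remainder of the division algorithm.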

Now, the most fundamental idea behind the notion of congruences
is that for any modulus m there are exactly m possible remainders,
namely, {0, 1, 2, ..., m - 1}, and therefore division by
m partitions the integers Z into m distinct sets of integers where
the integers in each set have the same remainder modulo m.
Thus, for example, if the modulus is 4, then the four sets in the
partition would be {..., -4, 0, 4, 8, 12, ...}, {..., -3, 1, 5, 9, ...},
{..., -2, 2, 6, 10, ...}, and {..., -1, 3, 7, 11, ...}.
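
A finite window of this partition is easy to produce. The Python sketch below sorts the integers from -4 to 12 into classes according to their remainder modulo 4; the function name residue_classes and the choice of window are our own.

```python
def residue_classes(m, integers):
    """Group the given integers by their remainder modulo m (m a positive integer)."""
    classes = {r: [] for r in range(m)}
    for n in integers:
        classes[n % m].append(n)  # n % m lies in 0..m-1 when m > 0
    return classes

for r, members in residue_classes(4, range(-4, 13)).items():
    print(r, members)
```

Each integer in the window lands in exactly one of the four lists, which is what it means for the sets to form a partition.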

We think of numbers in the same set as being equivalent since they 
have the same remainder. This is the genius of Gauss’s notation ≡
because it makes us think “equals” which is exactly what we should be 
doing. In this way congruence is an excellent example of an idea that 
is frequently used in mathematics when we want to think of different 
objects as being equivalent in some way even if they differ in other ways. 
For instance, we talk about two triangles being similar if their shape is 
the same even if their size is different. The general idea, then, is that an 
equivalence relation on the elements of a set partitions the elements of 
that set into “equivalent” sets. 

To introduce the idea of an equivalence relation, and to record 
three of the most basic properties of congruences, we state the 
following results as a theorem. The proof is left as an easy exercise in
Problem 3.26.

Theorem 3.5. Congruence modulo m is an equivalence relation; that is, for
integers a, b, and c

(1) a ≡ a (mod m);

(2) if a ≡ b (mod m), then b ≡ a (mod m); and

(3) if a ≡ b (mod m), and b ≡ c (mod m), then a ≡ c (mod m).

These three properties of any equivalence relation are called, 
respectively, the (1) reflexive property, (2) symmetric property, and (3) 
transitive property. 

Now, the next theorem is the really important one about congru- 
ences, because it is the one that tells us that we can do arithmetic the 
way we expect to: “on the remainders”. 

Theorem 3.6. For integers a, b, c, and d, if

a ≡ b (mod m) and c ≡ d (mod m)


then

(1) a + c ≡ b + d (mod m);

(2) a - c ≡ b - d (mod m); and

(3) ac ≡ bd (mod m).

Proof 

By hypothesis, m divides both a - b and c - d. Therefore, since 

(a + c) - (b + d) = (a - b) + (c - d), 

m also divides (a + c) - (b + d), which proves the first property. 

The proof of the second property is almost identical since we can
write


(a - c) - (b - d) = (a - b) - (c - d),


and so, m divides (a - c) - (b - d).

For the third property, we write

ac - bd = ac - bc + bc - bd = (a - b)c + b(c - d),

and since m divides both a - b and c - d, m divides ac - bd. This
completes the proof. ■

Theorem 3.6 says that you can deal with congruences the same way 
you are used to dealing with equations. With an equation you can
add, subtract, or multiply both sides by a number, and the result will
still be an equation. With a congruence you can now add, subtract, or 
multiply both sides by congruent integers, and the result will still be a 
congruence. 
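
Theorem 3.6 can be tested exhaustively on a small window of integers. The Python sketch below does so for one modulus at a time; the function name check_theorem_3_6 and the window size are arbitrary choices of ours, and a finite check is of course no replacement for the proof.

```python
def check_theorem_3_6(m, bound=6):
    """Check the three conclusions of Theorem 3.6 for all a, b, c, d in [-bound, bound]."""
    values = range(-bound, bound + 1)
    for a in values:
        for b in values:
            if (a - b) % m != 0:
                continue          # hypothesis a ≡ b (mod m) does not hold
            for c in values:
                for d in values:
                    if (c - d) % m != 0:
                        continue  # hypothesis c ≡ d (mod m) does not hold
                    if ((a + c) - (b + d)) % m != 0:
                        return False
                    if ((a - c) - (b - d)) % m != 0:
                        return False
                    if (a * c - b * d) % m != 0:
                        return False
    return True

print(all(check_theorem_3_6(m) for m in range(1, 10)))
```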

The one thing you may have noticed missing from Theorem 3.6 is 
anything equivalent to saying you can divide both sides of a congruence 
by congruent integers. We do this to equations all the time, but the 
reason this is missing from Theorem 3.6, of course, is that it isn’t true. 
Take a simple example such as 

10 ≡ 6 (mod 4). 

If we divide both sides of this congruence by 2, then we no longer have 
a congruence since clearly 5 ≢ 3 (mod 4). 

However, if you are careful to divide, or “cancel”, both sides of a 
congruence only by an integer that is relatively prime to the modulus, 
then the result will still be a congruence. 

Theorem 3.7. Let c be an integer that is relatively prime to the modulus m. 
If ca ≡ cb (mod m), then a ≡ b (mod m). 

Proof 

By hypothesis, m | ca - cb; that is, m | c(a - b). But, since gcd(m, c) = 1, we 
conclude that m | a - b (see Problem 3.18). Hence, a ≡ b (mod m). This 
completes the proof. ■ 
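
To see the role of the hypothesis gcd(m, c) = 1, here is a small Python sketch (an illustration only; the function names are assumptions, not notation from the text). It contrasts the failed cancellation in 10 ≡ 6 (mod 4) with a cancellation by a multiplier that is relatively prime to the modulus.

```python
from math import gcd


def congruent(a, b, m):
    """Return True when m divides a - b."""
    return (a - b) % m == 0


def survives_cancellation(c, a, b, m):
    """Given c*a ≡ c*b (mod m), report whether a ≡ b (mod m) still holds."""
    if not congruent(c * a, c * b, m):
        raise ValueError("c*a and c*b are not congruent modulo m")
    return congruent(a, b, m)


# 10 ≡ 6 (mod 4) is 2*5 ≡ 2*3 (mod 4), but gcd(2, 4) = 2, so cancelling 2 fails.
print(gcd(2, 4), survives_cancellation(2, 5, 3, 4))  # 2 False

# 15 ≡ 3 (mod 4) is 3*5 ≡ 3*1 (mod 4), and gcd(3, 4) = 1, so cancelling 3 is safe.
print(gcd(3, 4), survives_cancellation(3, 5, 1, 4))  # 1 True
```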


Divisibility Tests 

One of the easiest applications of the notion of congruences is to the 
familiar idea of divisibility tests. Almost everyone knows the standard 
test to determine whether a number is divisible by 3: you simply add 
the digits, and the number will be divisible by 3 if, and only if, this sum 
is divisible by 3. Thus, 111111 is divisible by 3, but 1111111 isn’t. (Try it: 
111111 = 3 · 37 037, but 1111111/3 = 370 370.333 . . . .) 

The reason this divisibility test works is easy to detect. If we write an 
integer N as 

N = d_1 · 1 + d_2 · 10 + d_3 · 10^2 + ··· + d_{r+1} · 10^r 

in terms of its digits d_1, d_2, . . . , d_{r+1}, then the sum S of its digits is 

S = d_1 + d_2 + ··· + d_{r+1}. 

So, the difference between N and the sum S of its digits is 

N - S = d_2 · 9 + d_3 · 99 + ··· + d_{r+1} · 999...9, 

which is divisible by 3. Thus, 

N ≡ S (mod 3). 


Hence N is divisible by 3 if and only if the sum S of its digits is divisible 
by 3. 

An almost identical congruence argument lies behind the standard 
test to determine whether a number is divisible by 9: you again add 
the digits, and the number will be divisible by 9 if and only if this 
sum is divisible by 9. For example, we immediately know the number 
123 456 789 is divisible by 9 because 1 + 2 + ··· + 9 = (9 · 10)/2 = 45, which 
is divisible by 9. (Or you can cheat and use your calculator to discover 
that 123 456 789 = 9 • 13 717 421.) 
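
The tests for 3 and 9 both rest on the fact that a number and its digit sum leave the same remainder. A short Python sketch of that check follows; the helper name `digit_sum` is an assumption made for illustration.

```python
def digit_sum(n):
    """Sum of the decimal digits of the integer n (the sign is ignored)."""
    return sum(int(ch) for ch in str(abs(n)))


for n in (111111, 1111111, 123456789):
    s = digit_sum(n)
    # N and S agree modulo 3 and modulo 9, so either both are divisible or neither is.
    print(n, s, n % 3 == s % 3, n % 9 == s % 9)
```

Each line of output should end with `True True`, since every term 9, 99, 999, ... in N - S is divisible by 9 and hence by 3.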

There is a similar, but much less well known, test to determine 
whether a number is divisible by 11: you take an “alternating” sum for 
the digits, and the number will be divisible by 11 if and only if this sum 
is divisible by 11. Thus, for example, 96 969 697 is divisible by 11 since 
9 - 6 + 9 - 6 + 9 - 6 + 9 - 7 = 11, but 85 858 585 isn't divisible by 11 
since 8 - 5 + 8 - 5 + 8 - 5 + 8 - 5 = 12. 

We can see that this divisibility test works by again writing a number 
N in terms of its digits as we did above, and then the alternating sum of 
its digits will be d_1 - d_2 + d_3 - ··· ± d_{r+1}, where we have begun with the 
1s digit for convenience. So the difference between the number N and 
this sum is 

d_2 · 11 + d_3 · 99 + d_4 · 1001 + d_5 · 9999 + ··· + d_{r+1} · (10^r + (-1)^{r+1}). 


Now, as before, it is supposed to be the case that this difference is 
congruent to 0 modulo 11, but that is not immediately clear. Of course, 
it is clear that 11 divides each of the terms 99, 9999, 999 999, and so on, 
but does 11 also divide each of the terms 11, 1001, 100 001, 10 000 001, 
and so on, forever? Well, yes it does, and this is relatively easy to see 
since 1001 = 11 · 91; 100 001 = 11 · 9091; 10 000 001 = 11 · 909 091; and 
so on. 

But an even better argument can explain why N is congruent to the 
alternating sum of its digits. Since 10 ≡ -1 (mod 11) we can write 

N = d_1 · 1 + d_2 · 10 + d_3 · 10^2 + ··· + d_{r+1} · 10^r 

  ≡ d_1 · 1 + d_2 · (-1) + d_3 · (-1)^2 + ··· + d_{r+1} · (-1)^r (mod 11), 

and N is congruent to the alternating sum of its digits! 
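
The congruence between N and its alternating digit sum can be checked directly. This is a minimal Python sketch with an assumed helper name, `alternating_digit_sum`. Because it starts from the 1s digit, its sign may differ from a sum taken from the leading digit (for 96 969 697 it gives -11 rather than 11), which does not affect divisibility by 11.

```python
def alternating_digit_sum(n):
    """Return d_1 - d_2 + d_3 - ..., where d_1 is the 1s digit of n."""
    digits = [int(ch) for ch in reversed(str(abs(n)))]
    return sum(d if i % 2 == 0 else -d for i, d in enumerate(digits))


for n in (96969697, 85858585):
    a = alternating_digit_sum(n)
    # Python's % returns a remainder in 0..10 even when a is negative.
    print(n, a, n % 11 == a % 11, n % 11 == 0)
```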


Continued Fractions 

We will conclude this chapter on divisibility by introducing a topic that 
has a very old and rich history, continued fractions. Recall that the 
Euclidean algorithm produced the following sequences of equations to 
find the greatest common divisor for 30 and 72: 


72 = 2 · 30 + 12, 

30 = 2 · 12 + 6, 

12 = 2 · 6 + 0. 


We can rewrite the first equation to produce an equation for the 
fraction 72/30: 

72/30 = 2 + 12/30 = 2 + 1/(30/12). 


In turn, we can rewrite the second equation to produce an equation for 
the fraction 30/12: 

30/12 = 2 + 6/12 = 2 + 1/(12/6), 

which we can now substitute into the above equation for 72/30 to get 

72/30 = 2 + 1/(2 + 1/(12/6)). 


Finally, we rewrite the last equation from the Euclidean algorithm to 
write the fraction 12/6 as 2, and then express the fraction 72/30 as 

72/30 = 2 + 1/(2 + 1/2). 

This last expression for 72/30 is called a continued fraction. The three 
quotients produced by the Euclidean algorithm (in this case, they were all 
2) are called the partial quotients for the continued fraction. 
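
Since the partial quotients are just the successive quotients of the Euclidean algorithm, they can be read off mechanically. The following Python sketch assumes positive integer inputs; the function name `partial_quotients` is illustrative.

```python
def partial_quotients(numerator, denominator):
    """Quotients produced by the Euclidean algorithm on two positive integers."""
    quotients = []
    while denominator != 0:
        q, r = divmod(numerator, denominator)  # numerator = q * denominator + r
        quotients.append(q)
        numerator, denominator = denominator, r
    return quotients


print(partial_quotients(72, 30))  # [2, 2, 2]
```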

In general, a continued fraction looks like 

q_1 + 1/(q_2 + 1/(q_3 + 1/(q_4 + ···))), 

and a continued fraction can be either finite, as is the example above, 
or infinite. The terms q_1, q_2, . . . are called the partial quotients of the 
continued fraction. 

Any rational number can be written as a finite continued fraction 
because we can apply the Euclidean algorithm to the numerator and 
the denominator as we did for 72 and 30. Let’s do this again for the 
fraction 355/113, but in a slightly different way, without invoking the 
Euclidean algorithm directly. The goal is to produce the partial quotients 
q_1, q_2, q_3, . . . . 

We begin by finding the integer part of the fraction 355/113, that is, the 
greatest integer less than or equal to 355/113. In this case, the integer part is 
3, and this is the first partial quotient q_1. (We find this by dividing 113 
into 355, which you can do on your calculator if you like.) This leaves us 
with a fractional part of 16/113. (We find this by computing 355/113 - 3.) 

There is a convenient notation that we can use here, called the floor 
function: ⌊355/113⌋ = 3. For a number x, the expression ⌊x⌋, which we read 
as the “floor of x”, represents the greatest integer less than or equal to 
x. Thus ⌊π⌋ = 3, ⌊7⌋ = 7, and ⌊-π⌋ = -4. An older name 
for this same function is the greatest integer function (and the standard 
notation has long been [x]). There is a companion to the floor function, 
called the ceiling function, ⌈x⌉, whose definition you can easily infer from 
both the name and the notation (a previously used notation for this 
function has been {x}). 

Our first step can now be summarized as 

355/113 = ⌊355/113⌋ + 16/113 = 3 + 16/113. 



Next, we invert the fractional part and repeat the process to get the 
second partial quotient q_2 = 7: 

113/16 = ⌊113/16⌋ + 1/16 = 7 + 1/16. 



In this example, we are now done, because when we invert the 
fractional part 1/16 we get an integer, 16/1, which is the final partial quotient 
q_3 = 16. Collecting the steps together, we have our continued fraction 

355/113 = 3 + 1/(7 + 1/16). 

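
One way to double-check a finite continued fraction is to collapse it back into an ordinary fraction, working from the innermost term outward. Here is a sketch using Python's exact `fractions.Fraction` type; the function name `evaluate` is an assumption for illustration.

```python
from fractions import Fraction


def evaluate(quotients):
    """Collapse q_1 + 1/(q_2 + 1/(q_3 + ...)) into a single exact fraction."""
    value = Fraction(quotients[-1])
    for q in reversed(quotients[:-1]):
        value = q + 1 / value  # wrap one more level around the current value
    return value


print(evaluate([3, 7, 16]))  # 355/113
print(evaluate([2, 2, 2]))   # 12/5, which is 72/30 in lowest terms
```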

We can even use this same process to find a continued fraction 
representation for an irrational number such as π. If we write π as a 
decimal 3.141 592 653 589 . . . , we see that the first partial quotient is 
q_1 = ⌊π⌋ = 3. So the first step gives us 

π = 3 + .141 592 653 589 . . . . 

Now, we invert the fractional part .141 592 653 589 . . . (using a 
calculator to compute 1/.141 592 653 589), and repeat the process to get the 
second partial quotient q_2 = 7: 

7.062 513 305 930 . . . = 7 + .062 513 305 930 . . . . 



In Problem 3.32, you will be asked to continue this process to find 
the first six partial quotients for the continued fraction expression for 
π = 3.141 592 653 589 . . . : 

π ≈ 3 + 1/(7 + 1/(q_3 + 1/(q_4 + 1/(q_5 + 1/q_6)))). 

Note that the first two partial quotients yield 3 + 1/7, that is, 22/7, 
a familiar and very old approximation for π that is correct to two 
decimal places. In Problem 3.33 you will use the first four partial quotients 
to find a much better approximation for π. 
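
The calculator procedure of taking the integer part and inverting the fractional part can be written as a short loop. This Python sketch is an illustration under one important assumption: it uses floating-point numbers, so only the first several partial quotients can be trusted before rounding error takes over. The name `float_partial_quotients` is hypothetical.

```python
import math


def float_partial_quotients(x, count):
    """Up to `count` partial quotients of x, found by repeated floor-and-invert."""
    quotients = []
    for _ in range(count):
        q = math.floor(x)
        quotients.append(q)
        fractional = x - q
        if fractional == 0:  # x was an integer, so the expansion is finished
            break
        x = 1 / fractional
    return quotients


print(float_partial_quotients(math.pi, 2))  # [3, 7], matching q_1 and q_2 above
```

Calling the function with a larger `count` continues the process in the same way.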

Let’s use this same process to find a continued fraction 
representation for the irrational number √3. For this example, we won’t use 
the decimal representation for √3, because we have a better idea in 
mind. 

The first partial quotient is q_1 = ⌊√3⌋ = 1. You may find it easiest to 
use your calculator to find this integral part of √3, although we could 
also argue that 1 < √3 < 2, since 1 < 3 < 4. So, what is the fractional 
part? Well, the fractional part is what is left over after we subtract the 
integer part, so it is just √3 - 1. 

Next, we are supposed to invert this fractional part √3 - 1 to get the 
second partial quotient. When we invert √3 - 1, we get 1/(√3 - 1), and it 
is kind of hard to tell what the integer part of this is. So what we do is 
multiply this fraction on both the top and bottom by √3 + 1. This is 
what we get: 

1/(√3 - 1) = 1/(√3 - 1) · (√3 + 1)/(√3 + 1) = (√3 + 1)/2. 

At this point, we now know that q_2 = 1. (This is because 2 < √3 + 1 < 3, 
or it is also fair to use your calculator here if you like.) 

Again, we compute the fractional part by subtracting the integral 
part, so for the fractional part we get (√3 + 1)/2 - 1 = (√3 - 1)/2. 

Continuing, we invert this fractional part (√3 - 1)/2 in order to find the 
third partial quotient, and get 

2/(√3 - 1) = 2/(√3 - 1) · (√3 + 1)/(√3 + 1) = √3 + 1. 

Hence, q_3 = 2. 
The fractional part is now (√3 + 1) - 2 = √3 - 1, and something 
very interesting has just happened. This is exactly the same fractional 
part that we inverted above when finding q_2. In other words, the 
partial quotients will now just repeat themselves forever in the same 
pattern: 

q_2 = 1, q_3 = 2, q_4 = 1, q_5 = 2, q_6 = 1, q_7 = 2, . . . . 

Thus the continued fraction representation for the irrational 
number √3 is 

√3 = 1 + 1/(1 + 1/(2 + 1/(1 + 1/(2 + ···)))), 

an infinite, but repeating, continued fraction. 
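
The rationalizing trick used above can be automated with integer arithmetic only. The sketch below is one standard way to do this, not a method taken from the text: it assumes every intermediate number can be written as (√n + m)/d with integers m and d, and it updates m and d at each step. The function name `sqrt_partial_quotients` is illustrative.

```python
from math import isqrt


def sqrt_partial_quotients(n, count):
    """First `count` partial quotients of sqrt(n), for a positive non-square integer n."""
    a0 = isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    # The number currently being processed is (sqrt(n) + m) / d.
    m, d, a = 0, 1, a0
    quotients = [a]
    while len(quotients) < count:
        m = d * a - m         # subtract the integer part a
        d = (n - m * m) // d  # invert and rationalize the denominator
        a = (a0 + m) // d     # integer part of the new (sqrt(n) + m) / d
        quotients.append(a)
    return quotients


print(sqrt_partial_quotients(3, 7))  # [1, 1, 2, 1, 2, 1, 2]
```

For n = 3 the pair (m, d) alternates between (1, 2) and (1, 1), which is the repetition observed in the text.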


Problems 

3.1 ★ We used the Euclidean algorithm to show that 6 is the greatest 

common divisor of 30 and 72. Show this using prime decomposition. 

Explain why this method of finding the greatest common divisor of 
two numbers depends upon the unique factorization property of the 
integers. 

3.2 We used the Euclidean algorithm to show that 2001 and 1984 are 
relatively prime. Show this using prime decomposition. 

3.3 Use prime decomposition to find the greatest common divisor of 1680 
and 2100. 

Describe a general method for finding the greatest common divisor 
of two numbers that uses prime decomposition. 

3.4 The least common multiple of two nonzero integers a and b is the least 
positive number that is a common multiple of a and b. We write the 
least common multiple as lcm (a,b). Find the least common multiple of 
1680 and 2100. 
Describe a general method for finding the least common multiple of 
two numbers that uses their prime decomposition. 

3.5 Use prime decomposition to prove that for any two positive numbers 
a and b 

gcd(a, b) · lcm(a, b) = ab. 


3.6 (H) The terms greatest common divisor and least common multiple make 
sense for any set of two or more nonzero integers. For example, 
gcd(12, 21, 30) = 3 and lcm(12, 21, 30) = 420. 

Verify that for the three numbers a = 63, b = 70, and c = 90 


gcd(a, b, c) · lcm(a, b, c) = √(abc). 


Then characterize which sets of three positive integers a, b, and c 
satisfy this identity. 

3.7 (H,S) For two integers a and b, not both zero, let 

S = { xa + yb | xa + yb > 0 }, 

that is, S is the set of all the linear combinations of a and b that are 
positive. Let d be the least element in S. Prove that d is the greatest 
common divisor of a and b, that is, prove that d = gcd(a, b). 

3.8 (H) Find seven positive composite numbers less than 360 that are pairwise 

relatively prime. Then prove that finding eight such numbers is 
impossible. 

3.9 ★ Prove property (5) of the basic divisibility properties: if a | b and a | c, and 
if x and y are any integers, then a | (bx + cy). 

3.10 Prove the following property of divisibility: if a | b_1, a | b_2, . . . , a | b_n, and if 
x_1, . . . , x_n are any integers, then a | (b_1 x_1 + b_2 x_2 + ··· + b_n x_n). 

3.11 Prove property (6) of the basic divisibility properties: if a | b and b | c, 
then a | c. This is known as the transitive property. 

3.12 Prove property (8) of the basic divisibility properties: if a | b and both 
numbers are positive, then a ≤ b. 

3.13 (H,S) Prove that n^5 - n is divisible by 30 for every integer n. 

3.14 * (H) In Chapter 1 we pointed out that the area of a Pythagorean triangle 

is always divisible by 6. This is because one leg is divisible by 3, and one 
leg — possibly the same leg — is divisible by 4. 

Prove that in a Pythagorean triangle it is also the case that one of the 
three sides is always divisible by 5. Thus we can conclude that if 
{x, y, z} is a Pythagorean triple, then 60 | xyz. 

3.15 (H) Use induction to prove that 7^n - 1 is divisible by 6 for all n > 0. 

3.16 (H) Use induction to prove that 11^n - 4^n is divisible by 7 for all n > 0. 

3.17 ★ It seems clear that if you take two integers a and b and divide them by 
their greatest common divisor d, then the resulting two numbers a/d and 
b/d will be relatively prime. Give a careful proof of this obvious, but 
important fact; that is, prove, without using prime factorization, that 
the greatest common divisor of a/d and b/d is 1. 

3.18 ★ Prove the following extremely useful result, which is a more general 

version of Theorem 3.3, and which also often goes by the name of 
Euclid's lemma: if two integers a and b are relatively prime and a | bc, 
then a | c. 

3.19 ★ (H,S) Use induction to prove the following generalization of 
Theorem 3.3 (Euclid’s lemma): if p is prime and p | a_1 a_2 ··· a_n, then p | a_i 
for some i. 

3.20 ★ (H) Prove that every positive integer n has a unique representation 
n = 2^k m where k ≥ 0 and m is odd. 

3.21 ★ An integer is said to be squarefree if it is not divisible by any square 

other than 1. Describe how to determine whether a number is 
squarefree just by looking at its canonical form. 

Prove that any positive integer n is the product of a squarefree 
integer and a square. 

3.22 (S) Show that the product of two consecutive positive integers can never 

be a square. 

Can the product of three consecutive positive integers be a square? 

3.23 (S) Use the complex numbers of the form a + b√(-6) from Chapter 2 

where a and b are integers to find an example of a number that is prime 
in the sense of being irreducible, but is not prime in the sense of 
satisfying the property from Theorem 3.3. (Thus this number would be 
called an irreducible element in this number system, but not a prime 
element.) 

3.24 (H,S) The integers modulo 6 is a number system consisting of just six 

numbers {0, 1, 2, 3, 4, 5} in which we do arithmetic modulo 6, that is, 
whenever we add or multiply two numbers in this system the result is 
the remainder upon division by 6 of the ordinary sum or product. So, 
for example, in this system 3 + 5 = 2 and 5 · 2 = 4. 

Find an example of a number in this system that is prime in the 
sense of satisfying the property from Theorem 3.3, but is not prime in 
the sense of being irreducible. 

3.25 Explain why the following two different factorizations of the number 
1001 do not contradict Theorem 3.4, the fundamental theorem of 
arithmetic: 

1001 = 7 • 143 = 11 • 91. 

3.26 ★ Prove Theorem 3.5. 

3.27 (H) The four-digit number 2310 contains each of the four digits 0, 1, 2, 

and 3 exactly once, and 2310 is the product of the first five primes: 

2310 = 2 · 3 · 5 · 7 · 11. 

Without multiplying it out, explain how you can quickly tell that 
the ten-digit number 

2 · 3 · 5 · 7 · 11 · 13 · 17 · 19 · 23 · 29 

cannot contain each of the ten digits 0, 1, 2, . . . , 9 exactly once. 

3.28 ★ (S) For a positive integer n, the notation τ(n) is used to represent the 
number of divisors of n. Here is a table listing the first few values of 
τ(n). 

n      1   2   3   4   5   6   7   8   9   10  11  12  13  14 
τ(n)   1   2   2   3   2   4   2   4   3   4   2   6   2   4 

Note that most values of τ(n) are even. 

(a) Make a conjecture about the values of n for which τ(n) is odd. 

(b) Give two proofs of your conjecture. One of your proofs should 
use the canonical prime decomposition of n, but there is an even 
easier proof. 

3.29 (H,S) Find the smallest positive integer with exactly twenty-six positive 

composite divisors. This problem appeared in the 2007 American 
Mathematics League Competition. 

Now, here is a problem not asked in this competition: find, 
in order, the ten smallest positive integers with exactly twenty-six 
positive composite divisors. (Be careful, there is one that is easy 
to miss.) 

3.30 (H,S) This problem has many colorful variations, and involves a jailer in 

charge of an unusual prison that has n prison cells all in a row. In some 
variations of this problem the jailer’s behavior is explained by excessive 
drinking; in other versions, he is given considerably higher motives for 
his actions. In any event, as the story begins, all n cells of this prison are 
locked. The jailer gets up from his desk and proceeds to turn his master 
key in the lock of each cell door, thereby unlocking each of the 
doors. 

We begin to suspect that the jailer has a mathematical bent, because 
he then goes back and, beginning with the second cell, proceeds to 
turn his key in the lock of every other cell door, thereby locking each of 
these doors. Next, he goes back and, beginning with the third cell, 
turns his key in the lock of every third cell door, thereby unlocking 
each of these doors. He continues in this bizarre fashion, at each stage 
going back to the kth cell and turning his key in every kth door. He 
does this through to the very end, that is, when he goes back to the nth 
cell and turns his key in that door. (In the excessive drinking version of 
the story, at this point the jailer simply passes out.) 

Which prisoners are now free to leave their cells? 

3.31 (H,S) We saw in Chapter 1 that the Babylonians used the number 60 as a 

base because it has so many factors. In fact, 60 is almost completely 
divisible since it is divisible by every positive integer less than √60, 
except for 7; that is, it is divisible by 1, 2, 3, 4, 5, and 6. 

Find the largest number n that is completely divisible; that is, find 
the largest number n such that n is divisible by every positive integer 
less than or equal to √n. 

3.32 ★ (S) In the text we began the process of finding a continued fraction 
representation for π = 3.141 592 653 589 . . . . Continue with this 
process, finding the first six partial quotients in the infinite continued 
fraction expression for π: 

π = 3 + 1/(7 + 1/(q_3 + 1/(q_4 + 1/(q_5 + 1/(q_6 + ···))))). 


3.33 ★ (H,S) If we stop an infinite continued fraction after only finitely many 
partial quotients, the result is a rational approximation for the number 
represented by the continued fraction. For example, the first two 
partial quotients for the continued fraction for π yield 3 + 1/7, that is, 
22/7. This approximation is correct to two decimal places since 
22/7 ≈ 3.14. 

Use the first four partial quotients from your solution to 
Problem 3.32 to find a better approximation for π; that is, express 

3 + 1/(7 + 1/(q_3 + 1/q_4)) 

as a rational number. Then use your calculator to write this rational 
number as a decimal. 

This rational approximation for π was discovered in the fifth 
century by the Chinese astronomer and mathematician Zu Chongzhi, 
and is correct to six decimal places. Explain why Zu’s approximation is 
so accurate. 

We have no idea how Zu found this rational approximation; his 
original text has been lost. Zu also correctly approximated π to seven 
decimal places, presumably by applying a method involving inscribed 
regular polygons with 24 576 sides. His was the best approximation for 
π that was known anywhere in the world for over nine hundred 
years! 

3.34 Use the same process we used for √3 to find a continued fraction 
representation for the irrational number \fl ■ 

3.35 ★ (H,S) It is easy to find the number a that is represented by an infinite 
continued fraction such as 

2 + 1/(2 + 1/(2 + 1/(2 + ···))) 

because this continued fraction, much like a fractal, contains perfect 
copies of itself. 

Show that in this case, the number a satisfies the equation 

a = 2 + 1/a, 

and then solve this equation to find a, the number represented by this 
continued fraction. 

3.36 ★ (S) The simplest possible infinite continued fraction is 

1 + 1/(1 + 1/(1 + 1/(1 + ···))) 

and this extraordinarily beautiful continued fraction represents the 
famous irrational number called the golden ratio. Find this 
number. 

3.37 (S) Although the Euclidean algorithm is a very natural way to find the 
greatest common divisor of two numbers, it can be quite slow. 

A simpler algorithm was developed by Nicholas Saunderson, who 
from 1711 to 1739 was the fourth Lucasian Professor of Mathematics at 
Cambridge University (Isaac Newton was the second, Stephen 
Hawking the seventeenth, from 1979 to 2009, and the current 
Lucasian Professor is the theoretical physicist Michael Green). 
Saunderson gives credit for the original method to Roger Cotes, 
another eminent mathematician at Cambridge, who used it for 
continued fractions. 

Here is how Saunderson’s algorithm finds the greatest common 
divisor of a = 72 and b = 30: 

a - 0 · b = 72, 
0 · a - b = -30, 
a - 2b = 12, 
2a - 5b = -6, 
-2a + 5b = 6. 


After writing down the first two equations, which merely record the 
two numbers, we notice that 72 divided by 30 yields a quotient of 2, so 
we multiply the second equation by 2 and add the result to the first 
equation. This produces the third equation. Now, repeat the process 
with the second and third equations. Again the quotient of 30 divided 
by 12 is 2, so we multiply the third equation by 2 and add the result to 
the second equation. This produces the fourth equation. Since the 
next quotient, 12/6, is an integer, we are done; however, because we prefer 
our greatest common divisor to be positive we multiply this last 
equation through by -1 to arrive at the final equation. Thus, 6 is the 
greatest common divisor. 

A huge advantage of Saunderson’s algorithm is that it immediately 
produces the greatest common divisor as a linear combination of the 
two numbers; for example, as soon as we discover that 6 is the greatest 
common divisor above we also know that 6 = -2(72) + 5(30). 

Use Saunderson’s algorithm to find the greatest common divisor of 
2001 and 1984. 

3.38 (S) In Problem 2.22 we saw that being able to find the prime 

decomposition for an integer is one of the necessary skills needed to 
solve KenKen puzzles. Another number system that shares with the 
integers the property of having unique factorization is the subset of the 
complex numbers known as the Gaussian integers, denoted by Z[i], 
which are all complex numbers of the form a + bi where a and b are 
integers. Use Gaussian integers to solve the three KenKen puzzles in 
Figures 3.1 to 3.3. 

[Figure 3.3] 

(a) Solve the 4x4 KenKen puzzle of Figure 3.1 using the four 
numbers 1, 1 + i, 1 - i, and 2. 

(b) Solve the 4x4 KenKen puzzle of Figure 3.2 using the four 
numbers 1, 1 + i, 1 - i, and 2. 

(c) Solve the 6x6 KenKen puzzle of Figure 3.3 using the six 
numbers 1, 1 + i, 1 - i, 1 + 2i, 1 - 2i, and 2. 


Diophantus 


We now return to our story of the development of modern number 
theory by looking at how it came about that Fermat first got interested 
in numbers. It all happened because of a single very remarkable book. 
And, because of the enormous influence that this book has had upon 
the path that number theory has taken, we will study this book and its 
author in some considerable detail. 


The Arithmetica 

Although many of the early discoveries in number theory were made 
in various regions throughout the world, often quite independently, 
modern number theory began when the remarkable work of the mid- 
third-century Greek mathematician Diophantus was rediscovered in 
the West in the early seventeenth century by Pierre de Fermat. 

Greek mathematics— completely lost to Europe for roughly seven 
hundred years — began to be known again in Europe only by the twelfth 
and thirteenth centuries because of Latin translations that became 
available along trade routes into Europe from the Islamic East. Islamic 
mathematicians and translators had been preserving and extending 
Greek mathematics, as well as absorbing influences from India, all these 
years in their great centers of learning such as Cairo, Baghdad, and 
Damascus. 

The Arithmetica of Diophantus was originally a work consisting of 
thirteen books, though only ten of these survive — six books in Greek 
have long been known, but only recently four more books in Arabic 
translation have also been found. Much of the content of the Arith- 
metica was known in Europe by the fifteenth century largely due to the 
efforts of the most influential mathematician of that century, Johann 
Müller, who is usually known by the name Regiomontanus (which is 
Latin for “from Königsberg”). 

By 1575, the Arithmetica had been translated into Latin, and subse- 
quently into French, and Rafael Bombelli had substantially revised his 
Algebra before its publication in 1572 as a result of reading a manuscript 
of Diophantus in the Vatican library and even included 143 problems 
taken directly from the Arithmetica. But the most famous translation, 
and certainly the most important, at least until a definitive translation 
appeared in the late nineteenth century, was the 1621 edition by 
Claude Gaspard Bachet de Méziriac. In 1612, Bachet had published 
an extremely successful book on recreational mathematics, Problèmes 
plaisants et délectables qui se font par les nombres, a collection of 
mathematical puzzles involving numbers. Bachet’s lifelong fascination with 
numbers made it inevitable that in his translation of Diophantus the 
commentary would emphasize that which interested Bachet the most, 
namely, questions involving numbers. 

Figure 4.1 Title page of Bachet's 1621 edition of the Arithmetica by Diophantus. 

Thus it was, during the 1630s, that Pierre de Fermat was introduced 
to the Arithmetica of Diophantus. The image we have of Fermat in those 
years is one of him sitting at home evening after evening as he pored 
over his copy of Bachet’s translation, carefully writing notes in the 
margins of the book as he read. 

Little is known about Diophantus himself, although it is generally 
believed that he lived in Alexandria around A.D. 250. A few scant pieces 
of biographical information concerning his life appear to be contained 
in a work known as the Greek Anthology from the fifth or sixth century. 
A portion of this work consists of more than forty mathematical 
problems; one of these problems does at least offer a glimpse, perhaps 
imaginary, into the life of Diophantus (see Problems 4.1 and 4.2). 


Problems from the Arithmetica 

The Arithmetica is a collection of problems — the known Greek books 
contain 189 problems— and even though the solutions presented by 
Diophantus are always quite specific, his solutions do tend to suggest 
general methods. As a result, Diophantus has often been called the 
father of algebra, in part because of these methods, but also because of 
the systematic use of notation and terminology that he introduced in 
this work. For example, even though he did not have the notation we 
now use for exponents, he nonetheless had his own effective symbolic 
way of representing polynomials. But the spirit of the Arithmetica has 
far more in common with modern number theory than with today’s 
practice of algebra. 

Let us look at a few of these problems. The idea is merely to try to get 
a feeling for the way in which Diophantus approached these problems, 
thereby gaining a glimpse of the true nature of this remarkable work. 

Problem 27 from Book I: to find two numbers such that their sum 
and product are given numbers. 

Diophantus solves a particular instance of this problem by taking 20 
as the given sum, and 96 as the given product. At this point, of course, 
we would let a and b be the two numbers, write a + b = 20 and ab = 96, 
and then solve these two equations simultaneously. 

Diophantus prefers to use a single unknown x; and he rather cleverly 
decides to let 2x be the difference of the two unknown numbers. Then, 
the two unknown numbers are given by 10 + x and 10 - x (we know their 
sum is 20, so 10 must be midway between the two numbers). Hence 
their product is given by (10 + x)(10 - x), that is, by 100 - x^2 = 96. 
Therefore, x = 2, and the two numbers are 12 and 8. 

Note that at the point where Diophantus wrote the difference be- 
tween the two numbers as 2x, it may seem to us that he is assuming that 
the difference is even. But Diophantus was perfectly happy if x turned 
out to be a rational number. This can be seen in Problem 4.3, where the 
given sum is 13 and the given product is 40. 
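
The substitution Diophantus uses works for any given sum and product, and the resulting x may be rational. The Python sketch below is a modern illustration, not Diophantus's own procedure; the function name `two_numbers` is an assumption, and the code only handles cases where x turns out to be rational.

```python
from fractions import Fraction
from math import isqrt


def two_numbers(total, product):
    """Find two rationals with the given integer sum and product, as half ± x."""
    half = Fraction(total, 2)
    x_squared = half * half - product  # from (half + x)(half - x) = product
    if x_squared < 0:
        raise ValueError("no real solution")
    # A Fraction is stored in lowest terms, so x is rational exactly when
    # both the numerator and the denominator are perfect squares.
    num, den = x_squared.numerator, x_squared.denominator
    root_num, root_den = isqrt(num), isqrt(den)
    if root_num * root_num != num or root_den * root_den != den:
        raise ValueError("x is not rational")
    x = Fraction(root_num, root_den)
    return half + x, half - x


print(*two_numbers(20, 96))  # 12 8
```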

Problem 7 from Book I: from the same number to subtract two 
given numbers so that the remainders will have a given ratio to 
one another. 
Diophantus solves this by taking the two given numbers to be 20 and 
100, and the given ratio to be 3:1. By letting x be the unknown number, 
x - 20 = 3(x - 100), and solving, he gets x = 140. 

Well, at least that’s a quick modern version of his solution. Let’s take 
a closer look at the style Diophantus actually used in the Arithmetica. 
Here is a literal translation by I. E. Drabkin of his solution. 

Let the numbers to be subtracted from the same number be 100 
and 20, and let the larger remainder be three times that of the 
smaller. 

Let the required number be 1x. If I subtract from it 100, the 
remainder is 1x - 100 units; if I subtract from it 20, the remainder 
is 1x - 20 units. Now the larger remainder will have to be three 
times the smaller. Therefore three times the smaller will be equal 
to the larger. Now three times the smaller is 3x - 300 units; and 
this is equal to 1x - 20 units. 

Let the deficiency be added in both cases. 3x equals 1x + 280 
units. If we subtract equals from equals, 2x equals 280 units, and x 
is 140 units. 

Now as to our problem. I have set the required number as 1x; it 
will therefore be 140 units. If I subtract from it 100, the remainder 
is 40; and if I subtract from it 20, the remainder is 120. And the 
larger remainder is three times the smaller. 

It does make you appreciate the convenience of the techniques of 
modern algebra, doesn’t it? Next is a problem from one of the four 
recently discovered books in Arabic translation: 

Problem 3 from Book IV (Arabic): we wish to find two square 
numbers the sum of which is a cubic number. 

We present Diophantus’s solution in his own words: 

We put x^2 as the smaller square and 4x^2 as the greater square.

The sum of the two squares is 5x^2, and this must be equal to a
cubic number. Let us make its side any number of xs we please, say
x again, so that the cube is x^3. Therefore, 5x^2 is equal to x^3. As the
side which contains the x^2s is the lesser in degree, we divide the
whole by x^2; hence x is equal to 5. Then, since we assumed the
smaller square to be x^2, and since x^2 arises from the
multiplication of x (which we found to be 5) by itself, x^2 is 25.
And, since we put for the greater square 4x^2, it is 100. The sum of
the two squares is 125, which is a cubic number with 5 as its side.
Therefore, we have found two square numbers the sum of which
is a cubic number, namely 125. This is what we intended to find.

The choice Diophantus makes above to use x^2 and 4x^2 for the two
squares may seem arbitrary, but he is in fact illustrating a fairly general
method. Many other solutions could be found by using other pairs such
as x^2 and 9x^2, which would yield a cubic number with 10 as its side.
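The general method can be sketched in a few lines of Python. This is only an illustration, not anything Diophantus wrote: the function name squares_summing_to_cube and the multiplier k are our own labels, and we assume, as in his solution, that the side of the cube is taken to be x itself.

```python
# Hypothetical helper: Diophantus's choice of x^2 and (k*x)^2 for the two squares.
def squares_summing_to_cube(k):
    """Return (smaller square, greater square, side of the cube) for the pair x^2, (k*x)^2."""
    # x^2 + (k*x)^2 = (1 + k^2) * x^2, and setting this equal to x^3 gives x = 1 + k^2.
    x = 1 + k * k
    return x * x, (k * x) ** 2, x

for k in (2, 3):
    small, large, side = squares_summing_to_cube(k)
    assert small + large == side ** 3
    print(k, small, large, side)
```

With k = 2 this reproduces the squares 25 and 100 with cube 125, and with k = 3 it gives the cube with side 10 mentioned above.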


A Note in the Margin 

We now come to the most celebrated problem of Diophantus. Fermat
found much that inspired him in the Arithmetica, but it was this particular
problem that was to evolve into one of the greatest of all mathematical
problems, a problem that would inspire every mathematician
from Fermat to the present day. As an isolated problem it sounds simple
and straightforward, yet the effect it has had on the mathematical world
cannot be measured.

Problem 8 from Book II: to divide a given square number into two
squares.

Diophantus as usual solves a particular instance of this problem by 
taking 16 as the square number to divide into two squares. Thus we 
immediately recognize Problem 8 as a very familiar kind of problem 
about Pythagorean triangles; however, note that he did not take 25 as 
his square number to divide, since this would have provided a solution 
that was much too easy: 9 + 16 = 25; instead, he is trying to solve 
x^2 + y^2 = 16.

Diophantus lets the first square be x^2. The second square will be
16 - x^2, which he takes to be of the form (mx - 4)^2. This seems a little
strange to us, but remember, this is his method, not ours.

Next he chooses m = 2, saying of this square: “let the side be 2x - 4.”
So (mx - 4)^2 = 4x^2 - 16x + 16 = 16 - x^2. Then, 5x^2 = 16x, and x = 16/5.
He concludes that one square, x^2, will be 256/25; the second square, 16 - x^2,
will be 144/25; and their sum is 400/25, or 16.

Once again, while the choice Diophantus makes in his solution to
let m = 2 may seem arbitrary, other solutions could be found using the
same method. For example, m = 5 yields (20/13)^2 + (48/13)^2 = 16 as a solution,
whereas m = 3 yields the same solution that Diophantus found using
m = 2.
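To experiment with other values of m, here is a minimal sketch using exact rational arithmetic. The function name divide_square and its parameters are hypothetical; the only assumption is the one in the text, namely that the second side has the form mx - 4 (or mx - side for a general square).

```python
from fractions import Fraction

def divide_square(side, m):
    """Split side^2 into two rational squares x^2 + y^2, taking y = m*x - side."""
    # (m*x - side)^2 = side^2 - x^2 simplifies to (m^2 + 1) * x^2 = 2*side*m*x,
    # so the nonzero solution is x = 2*side*m / (m^2 + 1).
    x = Fraction(2 * side * m, m * m + 1)
    y = m * x - side
    assert x * x + y * y == side * side
    return x, y

for m in (2, 3, 5):
    print(m, divide_square(4, m))
```

For m = 2 the two sides are 16/5 and 12/5, for m = 3 they are the same two numbers in the other order, and for m = 5 they are 20/13 and 48/13.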

While he was reading Problem 8 in the Arithmetica, Fermat realized
that although squares can be divided into two squares (that is,
the equation

z^2 = x^2 + y^2

has solutions such as the one found by Diophantus), the same thing is
not true for cubes or any higher power; that is, equations such as

z^3 = x^3 + y^3 and z^4 = x^4 + y^4,

and so on, for higher powers, have no such solutions, other than trivial
ones (where one term is 0). Fermat did not consider solutions such as
x = 3, y = 0, z = 3 to be solutions at all.

Thus, at some point quite early in his career— during the 1630s— 
Fermat added the following note in the margin next to Problem 8 in 
his copy of Arithmetica, the 1621 Bachet translation of Diophantus: 

On the other hand, it is impossible for a cube to be written as a 
sum of two cubes or a fourth power to be written as a sum of two 
fourth powers or, in general, for any number which is a power 
greater than the second to be written as a sum of two like powers. 

I have a truly marvelous proof of this proposition which this 
margin is too narrow to contain. 

Fermat’s excuse that the margin was simply too small for his proof
(“Hanc marginis exiguitas non caperet.”) is so well known now that it is
often used by mathematicians and students in a joking way in situations
where they wish to avoid having to provide an actual proof of
something, either because it would just be too much trouble, or more
likely because they might be bluffing and don’t really have a proof.

We will never know whether Fermat had anything substantial to 
support this claim of a “truly marvelous proof of this proposition.” 
Perhaps he thought he had a proof at the time. We do know that Fermat 
never made this same claim again throughout his long career. 
Nonetheless, this bold conjecture by Fermat that the equation 

x^n + y^n = z^n

has no nontrivial solutions for any integer n > 2 has long been known as 
Fermat’s last theorem in spite of the not insignificant detail that it would 
remain an unproven conjecture for almost four hundred years! It is called 
his “last” theorem, not because it was the last theorem of his career, but 
because it was left still unfinished and unproven at the end of his life. 
(See Problem 4.7 for an explanation of what we mean by a “nontrivial” 
solution to this equation.)

No problem in all of mathematics has attracted more attention, or
had more of an effect on the course of mathematics than Fermat’s 
last theorem. It all began with Diophantus, and with Problem 8 of 
Book II of the Arithmetica. We will return to Fermat’s last theorem in 
Chapter 7. 


Diophantine Equations 

In this section, we begin a study of a type of problem that goes by 
the general name Diophantine equations. The term Diophantine has 
obviously been inspired by the fact that these problems are related to 
the kind of problems that Diophantus dealt with in the Arithmetica, but 
as we shall see, the modern study of Diophantine equations differs in 
several important respects from the work of Diophantus. 

We will focus our attention for the moment on just two examples of 
Diophantine equations: 

72x + 30y = 24 and x^2 - 2y^2 = 1.

We are looking for all integer solutions to these equations. 

Let’s think about how different this approach to these problems 
is from that of Diophantus to his problems in the Arithmetica. He 
was not at all interested in linear equations, and the first equation 
above is a linear equation. He never used two unknowns to solve his 
problems, and both of these equations have two unknowns. He was 
always looking for a single positive, rational solution, but we want only 
integer solutions, including any negative ones; moreover, we want all 
solutions. 

So, you can see that the term “Diophantine” when applied to an 
equation is somewhat of a misnomer. The main thing you need to 
be aware of is that someone using this term is indicating that any 
solutions for the equation should come from a specific set of numbers. 
Traditionally, this term indicates solutions are to come from the set of 
integers, or the positive integers, or the rational numbers. 

Now, let’s investigate the first Diophantine equation 

72x + 30y = 24, 

which is a linear equation in two variables. Well, as it happens, it is easy
to spot a solution by inspection: x = 2, y = -4, since 72(2) + 30(-4) =
144 - 120 = 24. Are there other solutions?

Thinking about this geometrically is very helpful. In the xy-plane
any point that lies on the line 72x + 30y = 24 satisfies this equation.
Since x = 2 and y = -4 is a solution, the point (2, -4) must lie on this
line. But we're not interested in all the points on this line. For example,
the point (4/17, 4/17) is on this line, but x = y = 4/17 is not an integer solution
to the Diophantine equation. We are only interested in points on this
line whose coordinates are integers. We call such points lattice points.
Are there any other lattice points on the line 72x + 30y = 24 besides
(2, -4)?

If we write the equation of this line in the form y = -(12/5)x + 4/5, we see
that the slope of this line is -12/5 (at this point you might find it useful to
graph this line). So, if we go from the point (2, -4) on the line “5 units
to the right and down 12 units,” we will be at the point (7, -16), and
still on the line; and we can easily confirm that x = 7 and y = -16 is also
a solution to 72x + 30y = 24, since 72(7) + 30(-16) = 504 - 480 = 24.

Obviously, we can repeat this to find another lattice point on the line
by going from the point (7, -16) on the line “5 units to the right and
down 12 units” to the point (12, -28). In fact, we can find infinitely
many lattice points on the line this way and, hence, infinitely many
solutions to the equation 72x + 30y = 24.

Similarly, if we go “left 5 units and up 12 units” from (2, -4) we
will be at the lattice point (-3, 8), which is another solution to this
equation, since 72(-3) + 30(8) = -216 + 240 = 24. So we can also find
infinitely many solutions to the equation 72x + 30y = 24 by finding
lattice points along the line in this direction too.

We can even list all these solutions in a very convenient way by 
letting t represent the number of times we took a step of “5 units to the 
right and down 12 units” from the point (2, -4) on the line: 

x = 2 + 5t and y = -4 - 12t,

for the various integer values of t. Note that when t = 0 we are at the
point (2, -4), and for t = 1, 2, 3, ... , we are moving down the line to
find lattice points, and for t = -1, -2, -3, ... , we are moving up the
line to find lattice points; this is because the slope of this line happens
to be negative.
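A quick way to convince yourself that every value of t really does give a solution is to let a computer take the steps. The short sketch below is only an illustration; the function name lattice_points is our own.

```python
def lattice_points(t_values):
    """Points reached from (2, -4) by taking t steps of 5 to the right and 12 down."""
    return [(2 + 5 * t, -4 - 12 * t) for t in t_values]

points = lattice_points(range(-2, 3))
for x, y in points:
    # Every point produced this way should satisfy the original equation.
    assert 72 * x + 30 * y == 24
print(points)
```

The five points produced for t = -2, ..., 2 include (-3, 8), (2, -4), (7, -16), and (12, -28), the lattice points found above.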

Are there any other solutions to the equation 72x + 30y = 24 besides 
these? It should be clear that the answer is no. We will prove this 
rigorously in the next theorem, but any other solution would have to 
correspond to a lattice point P that is on the line and falls between two 
of the lattice points we just found, and this is impossible because the 
step sizes we used were 5 and 12, which are relatively prime. 

Now we are well on our way to having a general method for solving 
linear Diophantine equations in two variables. To summarize, what we 
know so far is that if we know one solution, then we can use the slope of
the line to write down all solutions in terms of a variable t, where t can
take on any integer value. We call t a parameter in this situation. 

Of course, a key thing we still need to talk about before we can claim 
to have a general method is how to come up with that very first solution. 
How do you find a solution like x = 2, y = -4 for the equation
72x + 30y = 24 in the first place? Saying “do it by inspection” is not
a general method, even though that is how we did it in this case. It is 
true that inspection is frequently the quickest and easiest way to find 
that first solution, but we still need a sure-fire technique to fall back on 
when necessary. 

Here is a technique that will never fail. You may have recognized the 
two numbers 72 and 30 in the equation 72x + 30y = 24 from Chapter 3.
In that chapter we found the greatest common divisor of 72 and 30 to 
be 6. More important, we also learned there that we could write that 
greatest common divisor as a linear combination of 72 and 30, and in fact, 
using the Euclidean algorithm, we wrote 


6 = 5 • 30 - 2 • 72. 


So, if we let x = -2 and y = 5, then we have a solution to the equation

72x + 30y = 6.

This equation isn’t quite the same as our original equation, but it is clear
that if we simply multiply these values of x and y by 4, then we will have
a solution to the original equation 72x + 30y = 24. Therefore, we can
take

x = -8, y = 20

as a first solution for our equation. Note that not only can we confirm
that this is a solution, since 72(-8) + 30(20) = -576 + 600 = 24, but that
it is the same solution we found by the previous method for the value of
the parameter t = -2.

At this point, we do have a completely general method for solving
linear Diophantine equations in two variables. If ax + by = c is the
equation, and if d = gcd(a, b), then we use the Euclidean algorithm
(or even better, Saunderson’s algorithm; see Problem 3.37) to find a
solution to the equation ax + by = d. Then we multiply this solution
by c/d to get a first solution for the equation ax + by = c. Once we
have this first solution, we can use the slope of the line to find all of
the lattice points on the line and, hence, all of the solutions for the
equation.

Before we formalize what we have discovered about solving linear
Diophantine equations in two variables, one obvious point is worth
making. A Diophantine equation such as

72x + 30y = 25 

can’t possibly have a solution. This is simply because gcd(72, 30) = 6,
and 6 ∤ 25. If x and y were integer solutions, then 6 | (72x + 30y), and this
would imply that 6 | 25, which is nonsense.
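The whole method, including the check that a solution exists at all, can be sketched as follows. This is a minimal sketch rather than the algorithm exactly as it was carried out in Chapter 3: it uses the standard extended Euclidean algorithm, the names extended_gcd and first_solution are hypothetical, and it assumes that a and b are positive.

```python
def extended_gcd(a, b):
    """Return (d, r, s) with d = gcd(a, b) and d = r*a + s*b (a and b positive)."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        # Each step keeps the invariants old_r = old_s*a + old_t*b and r = s*a + t*b.
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def first_solution(a, b, c):
    """Return one integer solution (x, y) of a*x + b*y = c, or None if there is none."""
    d, r, s = extended_gcd(a, b)
    if c % d != 0:
        return None          # d does not divide c, so no solution exists
    q = c // d
    return r * q, s * q      # multiply the solution of a*x + b*y = d by c/d

print(extended_gcd(72, 30))
print(first_solution(72, 30, 24))
print(first_solution(72, 30, 25))
```

For 72 and 30 the sketch recovers 6 = 5 · 30 - 2 · 72, and hence the first solution x = -8, y = 20 of 72x + 30y = 24, while for 72x + 30y = 25 it reports that there is no solution.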

Theorem 4.1. Let d be the greatest common divisor of two integers a and b.
Then the linear Diophantine equation ax + by = c has a solution if and only
if d | c.

Furthermore, if x = x_1 and y = y_1 is a solution for the equation, then all
solutions for the equation are given by

x = x_1 + t(b/d) and y = y_1 - t(a/d),

where the parameter t ranges over all integers.

Proof of Theorem 4.1 

There are several things to prove here: first, that the condition d | c is
both necessary and sufficient for there to be a solution to the equation; 
then, that each value of the parameter t not only produces a solution 
as described, but that there are no other solutions. That’s a total of four 
things to prove. 

We begin with the necessary and sufficient condition for the exis- 
tence of a solution. 

1. Suppose first that d | c; then we can write c = qd. We know that d
can be written as a linear combination of a and b, that is,

d = ra + sb. So

c = qd = q(ra + sb) = qra + qsb = a(qr) + b(qs),

and, therefore, x = qr, y = qs is a solution to the
equation ax + by = c.

2. Conversely, suppose x_1, y_1 is a solution to this equation. Then
c = ax_1 + by_1, and so, since d | a and d | b, we see that d | c by
property (5) of the basic properties of divisibility on page 66.

Thus ax + by = c has a solution if and only if d | c. Now we turn
to the issue of finding the other solutions given that we know one
solution: x = x_1, y = y_1.

3. First, since d is a common divisor of a and b, we can write a = ud
and b = vd; that is, u = a/d and v = b/d (in our previous example,
these step sizes were u = 12, v = 5).

Now, for any integer value of t, we claim that 

x = x_1 + vt, y = y_1 - ut

is also a solution to the equation. To verify this claim, we just plug
these values of x and y into the expression ax + by and get

a(x_1 + vt) + b(y_1 - ut) = (ax_1 + by_1) + avt - but
= c + (ud)vt - (vd)ut = c,

and we see that the equation is indeed satisfied.

Thus we have shown that for any integer t,

x = x_1 + t(b/d) and y = y_1 - t(a/d)

is also a solution to the equation ax + by = c. 

4. Finally, we have to show that we haven’t missed any solutions.
Suppose that x, y is a solution to the equation. We must show
that it is one of the solutions we already found. Since x, y is a
solution, we know that ax + by = c. But x_1, y_1 is also a solution, so
ax_1 + by_1 = c. Therefore, ax + by = ax_1 + by_1.

So, a(x - x_1) = b(y_1 - y), and, dividing by d, we get
(a/d)(x - x_1) = (b/d)(y_1 - y), which we rewrite as

u(x - x_1) = v(y_1 - y).

Now, u and v are relatively prime, since we already divided a and b
by their greatest common divisor (see Problem 3.17). So, by the
version of Euclid’s lemma in Problem 3.18, we can conclude that
u | (y_1 - y), and so y_1 - y = tu, for some integer t. Hence
x - x_1 = (1/u) · v · tu = vt. Therefore,

x = x_1 + vt and y = y_1 - ut,

and x, y is indeed one of the solutions we already found, exactly 
as desired. 

This completes the proof.
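A proof settles the matter, but it can still be reassuring to watch the theorem at work. The sketch below compares a brute-force search for solutions inside a square box with the solutions produced by the formulas of Theorem 4.1. The function names and the idea of a search box are our own, and of course a finite check like this is evidence, not a proof.

```python
from math import gcd

def solutions_in_box(a, b, c, bound):
    """All integer (x, y) with |x|, |y| <= bound and a*x + b*y == c, by brute force."""
    return {(x, y)
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)
            if a * x + b * y == c}

def parametrized_in_box(a, b, x1, y1, bound):
    """Points (x1 + t*(b/d), y1 - t*(a/d)) that land in the same box."""
    d = gcd(a, b)
    u, v = a // d, b // d
    found = set()
    # Any t that keeps x in the box satisfies |t| <= (bound + |x1|) / |v|,
    # so this range of t is wide enough (assumes v != 0).
    t_max = (bound + abs(x1)) // abs(v) + 1
    for t in range(-t_max, t_max + 1):
        x, y = x1 + v * t, y1 - u * t
        if abs(x) <= bound and abs(y) <= bound:
            found.add((x, y))
    return found

brute = solutions_in_box(72, 30, 24, 50)
formula = parametrized_in_box(72, 30, 2, -4, 50)
assert brute == formula
print(sorted(brute))
```

For 72x + 30y = 24 and a box of half-width 50, both approaches find the same eight lattice points.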

Theorem 4.1 is an extremely satisfactory result because it says that as
soon as you have one solution to a linear Diophantine equation in two
variables, then you immediately have all infinitely many solutions (one
for each value of t ∈ Z).

The only possible difficulty might be in coming up with that first 
solution. In many cases, such as for the equation 72x + 30y = 24, 
spotting a solution by inspection is fairly easy. Alternatively, we can 
use either the Euclidean algorithm or Saunderson’s algorithm to find 
a solution for ax + by = d when d | c (where d = gcd(a, b)), and then
multiply that solution by c/d to get a first solution for the equation
ax + by = c.


Pell's Equation 

Our second example of a Diophantine equation is x^2 - 2y^2 = 1, which
is a quadratic equation in two unknowns. As you might guess, solving a 
quadratic Diophantine equation in two unknowns is not going to be as 
easy as solving a linear Diophantine equation in two unknowns. Before 
we even think about solving this equation, let’s talk about why this 
equation is interesting. 

Here is one solution to this equation: x = 577, y = 408, which works 
since 


577^2 = 332 929 and 2 · 408^2 = 332 928.

What is really interesting about this solution, among the many solutions
that we could have chosen, is that this particular solution was
found during the fourth century in India, and then used to build the
fraction 577/408 as a rational approximation for √2. Look how close this
approximation is:

√2 = 1.414 213 562 ... , 577/408 = 1.414 215 686 ... ;

that is, it is correct to five decimal places since 577/408 - √2 ≈ 0.000002.

It is easy to see why this solution to the Diophantine equation gives
a good approximation to √2. If x^2 = 2y^2 + 1 for a pair of integers x and
y (that is, if x^2 - 2y^2 = 1), then x^2/y^2 = 2 + 1/y^2. Therefore,

x/y ≈ √2,

since 1/y^2 will be close to 0 for large values of y. For example, when
y = 408, then 1/y^2 = 0.000 006 007 ... .

Exactly this same argument can be used to show that, more generally,
for a Diophantine equation

x^2 - ny^2 = 1,

where n is a positive integer, any solution x and y with a large value
of y will produce a good rational approximation for √n.
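The argument is easy to check numerically. In the sketch below, the name pell_gap is hypothetical; it simply confirms that for a solution of x^2 - ny^2 = 1 the square of x/y exceeds n by exactly 1/y^2.

```python
from fractions import Fraction
from math import sqrt

def pell_gap(x, y, n):
    """For a solution of x^2 - n*y^2 = 1, return (x/y)^2 - n, which should equal 1/y^2."""
    assert x * x - n * y * y == 1
    return Fraction(x, y) ** 2 - n

# The solution x = 577, y = 408 of x^2 - 2y^2 = 1:
assert pell_gap(577, 408, 2) == Fraction(1, 408 ** 2)

# The difference between 577/408 and the floating-point value of sqrt(2) is about 0.000002.
print(577 / 408 - sqrt(2))
```

The same function works for other values of n; for instance, x = 26, y = 15 is a solution of x^2 - 3y^2 = 1, and (26/15)^2 exceeds 3 by exactly 1/225.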

Archimedes, who needed to be able to compute square roots
in his approximation for π, knew this method for approximating
√n.

The Diophantine equation x^2 - ny^2 = 1 is now called Pell’s equation,
after the seventeenth-century mathematician John Pell, although this 
too is actually a misnomer since Pell had no connection at all with 
equations of this kind. Mathematics abounds with similar instances of 
glitches in the naming process: l'Hôpital's rule for computing limits
and Cramer’s rule for solving systems of linear equations are two that 
you may have run across; and we have already mentioned one famous 
one: Fermat's last theorem. It is mildly troubling that such historical 
inaccuracies exist in the mathematical language, but they cause no 
more trouble to us than does, say, the fact that French fries may have 
originated in Belgium when we are enjoying a good hamburger and 
fries. 

As we indicated, the Diophantine equation x^2 - 2y^2 = 1 has many
solutions. In fact, we can find all of them. Here is the beginning of an
infinite list of all the solutions:

x = 3, y = 2; x = 17, y = 12; x = 99, y = 70; x = 577, y = 408; ....

We can verify each of these solutions:

3^2 - 2 · 2^2 = 9 - 8 = 1; 17^2 - 2 · 12^2 = 289 - 288 = 1;

99^2 - 2 · 70^2 = 9801 - 2 · 4900 = 9801 - 9800 = 1.

In order to discover how to find these infinitely many solutions to
the equation x^2 - 2y^2 = 1, we need to return to a topic that was
introduced in the last chapter. Since that topic is continued fractions,
this is yet another very strong indication that indeed the Pell equation
x^2 - 2y^2 = 1 is an interesting equation.


Continued Fractions

We saw in the last chapter that there is a very close connection between
continued fractions and the Euclidean algorithm. The partial quotients
generated to represent a rational number a/b as a continued fraction are
precisely the same quotients generated by the Euclidean algorithm to
find gcd(a, b).

So it is not at all surprising that continued fractions can be used in
the place of the Euclidean algorithm when solving linear Diophantine
equations such as 72x + 30y = 24. To see how they can also be used to
solve equations such as x^2 - 2y^2 = 1, however, is really quite amazing.

The key to all this turns out to be a very special property that partial 
quotients have; but first, some additional terminology will be helpful. 

If

q_1 + 1/(q_2 + 1/(q_3 + ... + 1/(q_{n-1} + 1/(q_n + ...))))

is a continued fraction (either finite or infinite) with partial quotients
q_1, q_2, ..., q_n, ..., then the terms

q_1,  q_1 + 1/q_2,  q_1 + 1/(q_2 + 1/q_3),  q_1 + 1/(q_2 + 1/(q_3 + 1/q_4)),  ...

are called the convergents for the continued fraction.

Thus, for example, in Chapter 3 we found the continued fraction
for √3:

√3 = 1 + 1/(1 + 1/(2 + 1/(1 + 1/(2 + ...)))).

Therefore, the convergents for this continued fraction are

1,  1 + 1/1,  1 + 1/(1 + 1/2),  1 + 1/(1 + 1/(2 + 1/1)),  1 + 1/(1 + 1/(2 + 1/(1 + 1/2))),  ... ;

that is, the convergents are

1/1, 2/1, 5/3, 7/4, 19/11, ... ,


where we wrote the first two convergents as fractions just for conve- 
nience later. 

The first thing we want to observe about these convergents is
that they really are converging to √3. This is a general fact about convergents
that we will prove in detail in Chapter 14 (an entire chapter
devoted to the topic of continued fractions) and of course is the
reason we call them “convergents” in the first place, but for now simply
observe that for the number √3 ≈ 1.732, the sequence of convergents

1/1 = 1.000, 2/1 = 2.000, 5/3 ≈ 1.667, 7/4 = 1.750, 19/11 ≈ 1.727,

does appear to be getting closer to √3.

Note also the way in which these numbers are approaching √3. The
terms alternate between lying below and lying above √3. Thus, the odd terms

1/1 = 1.000, 5/3 ≈ 1.667, 19/11 ≈ 1.727,

approach √3 from below, whereas the even terms

2/1 = 2.000, 7/4 = 1.750,

approach √3 from above.

For another example of convergents, we can use the continued fraction
we found in Chapter 3 for the number 72/30:

72/30 = 2 + 1/(2 + 1/2).

The convergents, therefore, are

2,  2 + 1/2,  2 + 1/(2 + 1/2);

that is, 2/1, 5/2, 12/5. Note that again the convergents do actually converge
to 72/30; they even alternate as before: 2 < 72/30, 5/2 > 72/30, and, of course, the
very last convergent, 12/5, is the fraction itself since 12/5 = 72/30.

We now come to the special property that partial quotients have, or 
we can now say more accurately, the special property that convergents 
have that is so interesting to us in the context of Diophantine equa- 
tions. We will prove this property in Chapter 14, but for now we will 
simply let it jump out at you by looking at two examples. 

Our first, and best, example will be to look again at the convergents
we found for the continued fraction representation for √3:

1/1, 2/1, 5/3, 7/4, 19/11.

Let's look at consecutive pairs of these fractions:

1/1 and 2/1,  2/1 and 5/3,  5/3 and 7/4,  7/4 and 19/11,

and observe what happens when we cross-multiply their numerators
and denominators.

We get the following computations:

2 · 1 - 1 · 1 = 2 - 1 = 1,

5 · 1 - 3 · 2 = 5 - 6 = -1,

7 · 3 - 4 · 5 = 21 - 20 = 1,

19 · 4 - 11 · 7 = 76 - 77 = -1.

It is not often that patterns are as obvious as this one. The cross-product
is always either +1 or -1, and it alternates between the two.

We state this precisely: if you take two consecutive convergents
a_{k-1}/b_{k-1} and a_k/b_k, then the cross-product of their two numerators and
denominators is given by

a_k b_{k-1} - b_k a_{k-1} = (-1)^k.
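If you would like to see the pattern for other continued fractions, the following sketch computes convergents directly from a list of partial quotients and then forms the cross-products. The function names are our own, and the convergents are computed naively, by collapsing each truncated continued fraction from the inside out with exact fractions.

```python
from fractions import Fraction

def convergents(partial_quotients):
    """Return the convergents a_k/b_k as (numerator, denominator) pairs."""
    result = []
    for k in range(1, len(partial_quotients) + 1):
        # Start with q_k and fold in q_(k-1), ..., q_1 from the inside out.
        value = Fraction(partial_quotients[k - 1])
        for q in reversed(partial_quotients[:k - 1]):
            value = q + 1 / value
        result.append((value.numerator, value.denominator))
    return result

def cross_products(convs):
    """Cross-products a_k*b_(k-1) - b_k*a_(k-1) for consecutive convergents."""
    return [a2 * b1 - b2 * a1 for (a1, b1), (a2, b2) in zip(convs, convs[1:])]

sqrt3_start = convergents([1, 1, 2, 1, 2])
print(sqrt3_start)
print(cross_products(sqrt3_start))
```

With the partial quotients 1, 1, 2, 1, 2 for √3 this reproduces the convergents 1/1, 2/1, 5/3, 7/4, 19/11 and the alternating cross-products 1, -1, 1, -1.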

What does this have to do with Diophantine equations? Well, we saw
earlier that the only hard part of solving a linear Diophantine equation
such as 72x + 30y = 24 is finding a first solution, since it is then easy
to write down the rest of the solutions using Theorem 4.1.

Our sure-fire approach for finding a first solution is to use the
Euclidean algorithm to solve instead the equation 72x + 30y = 6
(where 6 = gcd(72, 30)). This equation is of course the same as the
equation

12x + 5y = 1.

So, here is how we can find a first solution to this equation by using
continued fractions instead of the Euclidean algorithm. We already
know that the convergents for the fraction 72/30 = 12/5 are

2/1, 5/2, 12/5.

The last fraction necessarily contains 12 and 5, the coefficients of the
Diophantine equation we are trying to solve. So we look at the cross-product
of that fraction with the fraction immediately preceding it:

5/2 and 12/5.

This cross-product gives us 12 · 2 - 5 · 5 = -1, as we knew it would.
This expression, in spite of the presence of the minus signs, gives us
just what we are looking for since we can now take x = -2 and y = 5 as
a solution for the Diophantine equation 12x + 5y = 1.

Therefore, what we can tell just from this example is that because of
the remarkable cross-product property of convergents, a solution for a
Diophantine equation ax + by = 1, where a and b are relatively prime, is
just sitting there as the numerator and denominator of the next-to-last
convergent, right before the last convergent a/b. You may have to play
with the plus and minus signs a bit, but the numbers themselves are
just sitting there!

Let's find a solution to

31x - 24y = 1

just to make sure that this very pretty idea is clear. The continued
fraction for 31/24 is

1 + 1/(3 + 1/(2 + 1/3)),

so the convergents are 1/1, 4/3, 9/7, 31/24. So a solution to the equation is just
sitting there as the next-to-last convergent 9/7, and we can immediately
write down a solution for our equation: x = 7, y = 9.

Note that, in this example, checking the solution for the equation,

31(7) - 24(9) = 217 - 216 = 1,

is exactly the same computation as carrying out the cross-multiplication
of the last two convergents, 9/7 and 31/24, which is why
we didn’t need to change any signs in the solution for x and y.
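The idea of reading a solution off the next-to-last convergent can be sketched as follows. Two caveats: the function names are hypothetical, and instead of collapsing each convergent by hand the sketch uses the standard recurrence that builds each numerator and denominator from the two before it, a technique the text has not introduced at this point. We also assume that a and b are positive and relatively prime.

```python
def partial_quotients(a, b):
    """Partial quotients of a/b: the quotients produced by the Euclidean algorithm."""
    quotients = []
    while b != 0:
        q, r = divmod(a, b)
        quotients.append(q)
        a, b = b, r
    return quotients

def solve_unit_equation(a, b):
    """Return (x, y) with a*x - b*y == 1, for relatively prime positive a and b."""
    p_prev, p = 0, 1   # numerators of the two most recent convergents
    q_prev, q = 1, 0   # denominators of the two most recent convergents
    count = 0
    for c in partial_quotients(a, b):
        p_prev, p = p, c * p + p_prev
        q_prev, q = q, c * q + q_prev
        count += 1
    # Now p/q == a/b and p_prev/q_prev is the next-to-last convergent, with
    # cross-product a*q_prev - b*p_prev == (-1)**count; fix the signs if needed.
    sign = 1 if count % 2 == 0 else -1
    return sign * q_prev, sign * p_prev

print(partial_quotients(31, 24))
print(solve_unit_equation(31, 24))
print(solve_unit_equation(12, 5))
```

For 31x - 24y = 1 this returns x = 7, y = 9 with no sign changes. For 12 and 5 it returns x = -2, y = -5, which satisfies 12x - 5y = 1 and so corresponds to the solution x = -2, y = 5 of 12x + 5y = 1 found above.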

Now, let’s see how the sequence of convergents can be used to solve
a Diophantine equation such as x^2 - 2y^2 = 1, that is, a Pell equation.
Again, we won’t prove anything here, but just discover another
remarkable pattern. You should already have the idea that this is going
to involve the continued fraction expansion for √2.

However, since we have just found the convergents for the continued
fraction expansion for √3, let’s instead find a few solutions for the Pell
equation

x^2 - 3y^2 = 1

first. The convergents we found for √3 are

1/1, 2/1, 5/3, 7/4, 19/11, 26/15, ... ,

where we added one more convergent to help make the pattern a bit
easier to see.

The pattern is that, once again, solutions to the Diophantine equation
are just sitting there as convergents! This time, every even convergent
provides a solution to the Pell equation x^2 - 3y^2 = 1. Thus, the
convergents

2/1, 7/4, 26/15, ...

all provide solutions for x^2 - 3y^2 = 1, so

2^2 - 3 · 1^2 = 4 - 3 = 1,

7^2 - 3 · 4^2 = 49 - 48 = 1,

26^2 - 3 · 15^2 = 676 - 675 = 1,

and so on.

Note, by the way, that the same alternating pattern of +1 and -1 for
the cross-products holds here too; so, for example,

26 · 11 - 15 · 19 = 1 and 19 · 4 - 11 · 7 = -1.
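To hunt for this pattern with other values of n, here is a minimal sketch that generates the partial quotients of √n and tests each convergent against the Pell equation. It relies on two standard techniques that the text has not developed: an integer-only recurrence for the continued fraction of a square root, and the usual recurrence for building convergents. The function names are our own, and n is assumed to be a positive integer that is not a perfect square.

```python
from math import isqrt

def sqrt_partial_quotients(n, count):
    """First `count` partial quotients of sqrt(n), for n a positive non-square integer."""
    a0 = isqrt(n)
    m, d, a = 0, 1, a0
    quotients = [a0]
    while len(quotients) < count:
        # Standard recurrence: the current tail of the expansion is (sqrt(n) + m) / d.
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d
        quotients.append(a)
    return quotients

def pell_solutions(n, count):
    """Convergents x/y among the first `count` for sqrt(n) with x^2 - n*y^2 == 1."""
    p_prev, p = 0, 1
    q_prev, q = 1, 0
    found = []
    for c in sqrt_partial_quotients(n, count):
        p_prev, p = p, c * p + p_prev
        q_prev, q = q, c * q + q_prev
        if p * p - n * q * q == 1:
            found.append((p, q))
    return found

print(sqrt_partial_quotients(3, 6))
print(pell_solutions(3, 6))
```

For n = 3 the first six convergents yield exactly the solutions 2/1, 7/4, and 26/15 listed above, the convergents in the even positions.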

How to solve x^2 - 2y^2 = 1 is now clear. We need to expand √2 into
a continued fraction. We could do this using the method presented in
Chapter 3, but it turns out we have already done this in Problem 3.35,
because in doing that problem we discovered that

1 + √2 = 2 + 1/(2 + 1/(2 + ...)).

Therefore,

√2 = 1 + 1/(2 + 1/(2 + 1/(2 + ...))).

It is easy to produce the sequence of convergents for this continued
fraction:

1/1, 3/2, 7/5, 17/12, 41/29, 99/70, 239/169, 577/408, ....

Again, solutions to the Pell equation are just sitting there in the even
positions! Thus there are infinitely many solutions beginning with the
fraction 3/2, using every alternate fraction in this sequence to obtain a
solution to the equation x^2 - 2y^2 = 1:

3^2 - 2 · 2^2 = 9 - 8 = 1,

17^2 - 2 · 12^2 = 289 - 288 = 1,

99^2 - 2 · 70^2 = 9801 - 9800 = 1,

577^2 - 2 · 408^2 = 332 929 - 332 928 = 1,

and so on, forever.

In the previous paragraph, when we said it was easy to produce
the sequence of convergents you might have not quite agreed that
computing a convergent such as 577/408 was all that easy (incidentally, this
is the one found in fourth-century India and used to approximate √2).
But this sequence really is easy to produce because each fraction can
immediately be determined from the previous fraction. For example,
from the fraction 41/29, you can add 41 + 29 = 70 to get the denominator
for the next fraction; then you can add the two denominators 29 + 70 =
99 to get the numerator for the next fraction.

So, in this way, we could find the next convergent in this sequence,
not by computing the convergent directly, but simply from the fraction
577/408, getting 577 + 408 = 985 for the denominator, and then getting
408 + 985 = 1393 for the numerator. Thus, the next convergent is 1393/985.

Also, as you build a sequence of convergents, you can use the alternating
pattern of +1 and -1 for the cross-products as a quick means
of checking your work as you go. For example, we can check the cross-product
between 1393/985 and 577/408. When we get

1393 · 408 - 985 · 577 = 568 344 - 568 345 = -1,

we can be fairly confident that our newest fraction is correct.
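The step from one convergent of √2 to the next is simple enough to automate. The sketch below follows the rule just described; the function names are hypothetical, and the cross-product check is included as the same kind of safeguard we used by hand.

```python
def next_convergent(a, b):
    """From the convergent a/b of sqrt(2), build the next one as described in the text."""
    new_b = a + b        # new denominator: old numerator plus old denominator
    new_a = b + new_b    # new numerator: old denominator plus new denominator
    return new_a, new_b

def sqrt2_convergents(count):
    """The first `count` convergents of sqrt(2), starting from 1/1."""
    convs = [(1, 1)]
    while len(convs) < count:
        a, b = convs[-1]
        new_a, new_b = next_convergent(a, b)
        # Quick check of our work: consecutive cross-products must be +1 or -1.
        assert abs(new_a * b - new_b * a) == 1
        convs.append((new_a, new_b))
    return convs

print(sqrt2_convergents(9))
```

Asking for nine convergents reproduces the eight listed earlier and then continues with 1393/985.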

One final comment about the Pell equation x^2 - 2y^2 = 1: even though
we have seen how to find infinitely many solutions, we don’t know that
this produces all solutions to this equation; however, we will leave that
discussion until Chapter 14.


Problems

4.1 (S) Most of the problems in the Greek Anthology are very much like some
familiar-sounding problems that can often be found in algebra books
used today. For example, consider this problem taken from the Greek
Anthology:

how many apples did you start with if you gave apples to six 
people by giving the first person a third of the apples, the second 
person a fourth, the third person a fifth, the fourth person an 
eighth, and ten apples to the fifth person, and the last apple to 
the sixth person? 

Since we are studying number theory in this book, solve this 
problem first by using the notion of prime decomposition. Use 
Theorem 3.2, or Theorem 3.4 (the fundamental theorem of 
arithmetic), explicitly. 

Then pretend you are in an algebra class, and solve this problem 
using straightforward algebra. What do you notice about these two 
methods? 

4.2 (H,S) Assume that the following problem from the Greek Anthology
accurately relates details from the life of Diophantus.

This tomb holds Diophantus. Ah, how great a marvel! The tomb
tells scientifically the measure of his life. God granted him to be a
boy for the sixth part of his life, and adding a twelfth part to this,
He clothed his cheeks with down. He lit him the light of wedlock
after a seventh part, and five years after his marriage He granted
him a son. Alas! late-born wretched child; after attaining the
measure of half his father’s life, chill Fate took him. After
consoling his grief by this science of numbers for four years he
ended his life.

How old was Diophantus when he died? 

4.3 In addition to his solution of Problem 27 in Book I, which we discussed 
in the text, Diophantus also provided a necessary condition for there to 
be a solution to this problem: 

The square of half the sum must exceed the product by a square 
number. 

In the particular instance of the problem that Diophantus solved, the 
sum was 20, so the square of half the sum was 100, and that did exceed 
the product 96 by a square number, namely 4.
But what Diophantus meant by the term “square number” is what is
now called a rational square, and not “square integer” as we would 
normally mean the term today. Show that the condition above is also 
satisfied — using Diophantus’s meaning — in the case where the given 
sum is 13 and the given product is 40. 

Then, solve Problem 27 in Book I using Diophantus’s method in the 
case where the given sum is 13 and the given product is 40. 

4.4 (H,S) Use the same method Diophantus used for Problem 27 of Book I to
solve the following problem from the Arithmetica:

Find two numbers such that their sum and the sum of their 
squares are given numbers. 

Then prove that a necessary and sufficient condition for there to be a 
solution for this problem in the integers is that twice the sum of the 
squares exceeds the square of the sum by a square integer. 

4.5 (H,S) Solve Problem 8 of Book II by dividing 9 into the sum of two squares,
using the method of Diophantus.

4.6 (S) In his solution for Problem 8 of Book II, Diophantus divided the square
integer 16 into two rational squares $(16/5)^2$ and $(12/5)^2$. Use this solution to
find a square integer that can be divided into two square integers.

4.7 In his famous note in the margin of his copy of Bachet’s translation of 
Diophantus, Fermat wrote 

it is impossible for a cube to be written as a sum of two cubes or ... 

and by this he meant that it is impossible for any cube to be written as 
the sum of two rational cubes (and similarly for any higher power). 
Prove that the equation 

$$x^3 + y^3 = z^3$$

has no solution in the positive rational numbers if and only if it has no 
solutions in the positive integers. (Here we restrict possible solutions to 
positive numbers in order to avoid such “trivial” solutions as 
x = 7, y = 0, z = 7.) 

4.8 ★ Fermat’s last theorem says that the equation

$$x^n + y^n = z^n$$

has no nontrivial solutions for any integer $n > 2$. (See Problem 4.7 for
the meaning of nontrivial in this context.)

Show that in order to prove Fermat’s last theorem, it is sufficient to 
prove it for the following cases: 

(a) n = 4; 

(b) n is an odd prime. 

4.9 (H) We saw in Problem 1.18 that, while “it is impossible for a cube to be
written as a sum of two cubes,” it is possible for a cube to be written as
a sum of three cubes. In fact, the equation

$$a^3 + b^3 + c^3 = d^3$$

even has a solution, $3^3 + 4^3 + 5^3 = 6^3$, in which $a$, $b$, $c$, and $d$ are
consecutive positive integers.

Prove that this is the only solution to this equation consisting of 
four consecutive positive integers by showing that the only solution in 
the positive integers to the equation 

$$x^3 + (x + 1)^3 + (x + 2)^3 = (x + 3)^3$$

is $x = 3$.

4.10 (H,S) The problems in Book VI of the Arithmetica involve right triangles
and their areas. Solve the following:

Problem 21 from Book VI (Greek): find a right triangle such that
its perimeter is a square, while its perimeter added to its area gives
a cube.

4.11 (S) Prove the following identities involving triangular numbers. These
results are all due to Bachet. They are included here as a review of
previous material, but also as an indication of Bachet’s deep interest in
triangular numbers.

(a) $t_{m+n} = t_m + t_n + mn$;

(b) $n^3 + 6t_n + 1 = (n + 1)^3$;

(c) $1^3 + 2^3 + 3^3 + \cdots + n^3 = t_n^2$.

4.12 ★ (S) Use the Euclidean algorithm to find a first solution for the
Diophantine equation $34x - 21y = 1$, and then use Theorem 4.1 to
find all solutions for this equation.

Next, show how continued fractions could be used to find a first
solution for this same equation. Also, find a value of $t$ (in terms of
Theorem 4.1) that shows that this particular “first” solution is
included among your previous list of solutions.

Do you recognize this continued fraction? If not, look back at 
Problem 3.36. The numbers in the sequence of convergents should also 
look very familiar. 

4.13 ★ Extend the sequence of convergents

$$\frac{1}{1},\ \frac{3}{2},\ \frac{7}{5},\ \frac{17}{12},\ \frac{41}{29},\ \frac{99}{70},\ \frac{239}{169},\ \ldots$$

for the continued fraction for $\sqrt{2}$ using the easy recursive
method mentioned in the text in order to accomplish the
following:

(a) find two more solutions to the Pell equation $x^2 - 2y^2 = 1$;

(b) find a rational approximation for $\sqrt{2}$ as accurate as your
calculator produces; thus, with nine-decimal place precision,
the square of your approximation should be 2.000 000 000.

4.14 (S) Use continued fractions to find a rational approximation for $\sqrt{5}$ that is
correct to four decimal places.

Which convergents for this continued fraction appear to provide
solutions to the Pell equation $x^2 - 5y^2 = 1$?

4.15 Use continued fractions to find a rational approximation for $\sqrt{7}$ that is
correct to four decimal places.

Which convergents for this continued fraction appear to provide
solutions to the Pell equation $x^2 - 7y^2 = 1$?

4.16 ★ (H) We have seen that solutions to the Pell equation $x^2 - 2y^2 = 1$ can
be found using continued fractions, and that they occur as the even
convergents in the sequence of convergents:
$\frac{1}{1}, \frac{3}{2}, \frac{7}{5}, \frac{17}{12}, \frac{41}{29}, \frac{99}{70}, \frac{239}{169}, \ldots$.
Also, as we pointed out, computing this sequence is easy
since you can recursively get from one fraction to the next. Here is
another way to compute solutions to Pell’s equation.

Begin with the first solution $x = 3$, $y = 2$. Then, all solutions are
found by computing $(3 + 2\sqrt{2})^n$ for each $n \ge 1$. We will prove why this
is true later, but for now simply notice that

$$x^2 - 2y^2 = (x + \sqrt{2}\,y)(x - \sqrt{2}\,y).$$

So, for example, $(3 + 2\sqrt{2})^2 = 17 + 12\sqrt{2}$, and $\frac{17}{12}$ is the fourth
convergent, and $x = 17$, $y = 12$ is the second solution. Next,
$(3 + 2\sqrt{2})^3 = (17 + 12\sqrt{2})(3 + 2\sqrt{2}) = 99 + 70\sqrt{2}$, yielding the sixth
convergent, and the third solution.

(a) Find the next two solutions to $x^2 - 2y^2 = 1$ using this method;
that is, compute $(3 + 2\sqrt{2})^4$ and $(3 + 2\sqrt{2})^5$.

(b) Use this same method to find the first three solutions to the 
equation x 2 - ly 2 = 1. 

4.17 (S) We can use the technique of expanding $\sqrt{2}$ into a continued fraction
together with Fermat’s method of infinite descent to give an alternate
proof that $\sqrt{2}$ is irrational. Here is that alternate proof. Since it uses
infinite descent, it is a proof by contradiction.

Suppose that $\sqrt{2}$ is rational, and write $\sqrt{2} = \frac{a}{b}$. Our goal is to write
$\sqrt{2}$ as another fraction, $\sqrt{2} = \frac{c}{d}$, where $c < a$, $d < b$. This will be the
infinite descent we are looking for, and our contradiction, since you
can’t keep getting smaller and smaller numerators and denominators
forever.

We will use the technique of continued fractions to find this
“smaller” fraction $\frac{c}{d}$. The first step in continued fractions is to invert
the fractional part. In this case, the fractional part is $\sqrt{2} - 1$, so we
simply invert that to get

$$\frac{1}{\sqrt{2} - 1} = \sqrt{2} + 1.$$

Now comes the critical step: we replace one of the $\sqrt{2}$ terms in this
recursive expression by $\frac{a}{b}$, and solve for the other $\sqrt{2}$ term; so, we write

$$\frac{1}{\frac{a}{b} - 1} = \sqrt{2} + 1,$$

and solve for $\sqrt{2}$. Thus we get

$$\sqrt{2} = \frac{1}{\frac{a}{b} - 1} - 1 = \frac{b}{a - b} - 1 = \frac{2b - a}{a - b},$$

and this is the smaller fraction we were looking for! All we have to do
now is show that (i) $0 < 2b - a < a$, and that (ii) $0 < a - b < b$, and
we’re done.

Now, since $\frac{a}{b} < 2$, we know that $a < 2b$, and so we see that $a - b < b$;
and since $\frac{a}{b} > 1$, we have $a > b$, which means that $2a > 2b$, and so
$2b - a < a$. Similarly, both $2b - a$ and $a - b$ are positive.

Therefore, we have expressed $\sqrt{2}$ as a fraction $\frac{2b - a}{a - b}$ whose numerator
and denominator are smaller than the numerator and denominator,
respectively, in $\frac{a}{b}$. Since we can repeat this same process on this new
fraction, we see that we can produce an infinite sequence of
fractions (all equal to $\sqrt{2}$) whose numerators and denominators
steadily decrease, which is clearly impossible (by the well-ordering
principle). This contradiction shows that $\sqrt{2}$ is irrational.

It is worth noting that the above recursion formula applies to the
convergents for the continued fraction for $\sqrt{2}$. For example, if $\frac{a}{b} = \frac{239}{169}$
and we compute $\frac{2b - a}{a - b}$ we get $\frac{99}{70}$, which is the convergent immediately
preceding $\frac{239}{169}$.

Use this same method to prove that $\sqrt{3}$ is irrational.

4.18 ★ (S) Find all integer solutions for the following Diophantine equations:

(a) $10x - 3y = 14$;

(b) $66x + 210y = 78$;

(c) $12x + 15y = 20$.

4.19 ★ (H) Find all positive integer solutions for the Diophantine equation

$$10x + 3y = 72.$$

4.20 (H,S) Here is a problem, due to the great eighteenth-century
mathematician Leonhard Euler, that quite naturally requires us to
consider only positive (or, at least, nonnegative) solutions to a
Diophantine equation:

You have $1770, and want to spend all of this money buying cows
and horses, cows costing $21 each, and horses costing $31. How
many cows and horses can you buy?

4.21 (H,S) Find all positive integer solutions for the equation

$$\left(1 + \frac{1}{x}\right)\left(1 + \frac{1}{y}\right)\left(1 + \frac{1}{z}\right) = 2.$$


We will concentrate our attention in this chapter once again on Pierre
de Fermat, and in particular focus on his intense fascination with 
questions having to do with sums of squares. We begin with one of 
his most beautiful discoveries, a result he announced to a friend on 
Christmas day, 1640. 


Christmas Day, 1640 

Bachet’s translation of Diophantus opened a new mathematical world 
for Fermat, and he spent the rest of his life exploring it. In this chapter 
we will look at several discoveries that Fermat made within the pages of 
the Arithmetica, but we begin with a topic that was central in the work 
of Diophantus: sums of squares. 

Pierre de Fermat was one of the greatest mathematicians of all time, 
yet by today’s standards he was not a working mathematician; in his 
day such a concept did not even exist, though he might have considered 
himself a “geometer.” Fermat’s professional career was as a jurist at the 
High Court in Toulouse and at a nearby court in Castres. 

In spite of this full-time job, Fermat did much of the groundbreaking 
work on tangents and on maxima and minima problems that would 
lead to the discovery of calculus by Newton and Leibniz later in the 
seventeenth century. He also made equally important contributions 
to physics, such as his discovery of the principle of least time, which 
says that light travels between two points by a path that minimizes the 
amount of time taken to travel between the two points. This principle, 
in turn, implies the familiar laws of reflection and refraction for light. 
Sadly, however, Fermat published almost nothing about his many results
during his lifetime.

What we know of the work of Fermat we know only because of
two people, his son Clément-Samuel, and a French friar who lived
in a monastery in Paris. After Fermat’s death in 1665 his son spent
five years editing his father’s papers before publishing his work in two
volumes in 1669 and 1670, the latter being an edition of the Arithmetica
by Diophantus that included forty-eight of the marginal notes made
by his father in his original copy of Bachet’s translation. This is why
we know today of Fermat’s most famous marginal note: “I have a
truly marvelous proof of this proposition which this margin is too
narrow to contain.” His son included this observation in this 1670
edition.

Figure 5.1 Title page of 1670 edition of the Arithmetica with observations by
Fermat.

Marin Mersenne was a French friar living in Paris during the first 
half of the seventeenth century. More important, he was also the center 
of a vast network of scientists and mathematicians spread throughout 
Europe. Much of the excitement and intellectual vigor of that period 
was due to the correspondence that took place among this circle of 
friends — many of whom never met; Fermat, for example, simply did 
not travel. Many of the letters that document this correspondence still 
exist today, letters written to one another by a truly remarkable group of 
men that included Mersenne, Descartes, Fermat, Desargues, Pascal, and 
Roberval. From these letters we have learned much about what Fermat 
knew, and when he knew it.

Figure 5.2 Marin Mersenne, 1588-1648.

We discover from a letter to Roberval, in 1640, that Fermat knows
when a number can be written as a sum of two squares, and Fermat
wrote to Mersenne on Christmas day that same year to tell him that
every prime of the form $4n + 1$ can be written in only one way as a
sum of two squares. Fermat’s interest in questions concerning squares
arose early as a result of his reading of Diophantus. Here is a passage
from Book III of the Arithmetica:

It is the nature of 65 that it can be written in two different ways as
a sum of two squares, as 16 + 49 and as 64 + 1; this happens
because it is the product of 13 and 5, each of which is a sum of two
squares.

What is fascinating about this passage, besides it telling us that Diophantus
considered such questions, is that it means that Diophantus
was aware of the following fundamental identity about sums of squares
(see Problem 5.1):

$$(a^2 + b^2)(c^2 + d^2) = (ac \pm bd)^2 + (ad \mp bc)^2.$$

Here we are using the notation $\pm$ and $\mp$ in a standard way to mean that
when the top symbol is used in one place, then the top symbol is also
used in the other place, and similarly for the bottom symbols; in other
words, we are really writing two identities at the same time. Thus, this
particular identity says that if you can factor a number as on the left,
then you get two different representations of the number as a sum of
two squares on the right.

So, in the example cited by Diophantus, $65 = 13 \cdot 5$, and, as he says,
writing each of 13 and 5 as a sum of two squares as $13 = 2^2 + 3^2$ and
$5 = 1^2 + 2^2$, we get, using the top choice for $\pm$ and $\mp$,

$$65 = 13 \cdot 5 = (3^2 + 2^2)(2^2 + 1^2) = (3 \cdot 2 + 2 \cdot 1)^2 + (3 \cdot 1 - 2 \cdot 2)^2,$$

and, using the bottom choice for $\pm$ and $\mp$, we get

$$65 = 13 \cdot 5 = (3^2 + 2^2)(2^2 + 1^2) = (3 \cdot 2 - 2 \cdot 1)^2 + (3 \cdot 1 + 2 \cdot 2)^2,$$

and so we get two different representations of 65 as a sum of two squares:

$$65 = 8^2 + 1^2 \quad\text{and}\quad 65 = 4^2 + 7^2.$$
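
The arithmetic in this example is easy to check by machine. Here is a minimal Python sketch that evaluates both sign choices of the identity; the function name `fibonacci_identity` is ours, chosen only for illustration, and is not notation from the text.

```python
def fibonacci_identity(a, b, c, d):
    """Return the pairs (ac + bd, ad - bc) and (ac - bd, ad + bc).

    Each pair (u, v) satisfies u^2 + v^2 == (a^2 + b^2) * (c^2 + d^2).
    """
    top = (a * c + b * d, a * d - b * c)      # top choice of the signs
    bottom = (a * c - b * d, a * d + b * c)   # bottom choice of the signs
    return top, bottom


# 13 = 3^2 + 2^2 and 5 = 2^2 + 1^2, as in the example cited by Diophantus.
top, bottom = fibonacci_identity(3, 2, 2, 1)
for u, v in (top, bottom):
    # A negative entry is harmless, since only its square is used.
    print(abs(u), abs(v), u * u + v * v)
```

Running this prints `8 1 65` and then `4 7 65`, matching the two representations of 65 found above.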

The fundamental identity about sums of squares that we just used 
was first published by Fibonacci in Liber quadratorum in 1225, and so 
we shall always refer to it as the fundamental identity of Fibonacci. This 
remarkable identity means that our focus can turn to the question of 
which primes can be written as a sum of two squares, because if the 
prime factors of a number, such as 65, can be written as a sum of two 
squares, then so can the number. When Fermat raised the question as to 
which numbers could be written as the sum of two squares — a question 
also posed independently by Albert Girard in 1629 — he was led directly 
to prime numbers, and to one of his most beautiful theorems. 

To allow you the pleasure of discovering Fermat’s beautiful
theorem, and to experience for yourself the experimental nature of number
theory, here is a list of the primes less than 100:

2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,

43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97;

and here also is a list of the squares less than 100:

1, 4, 9, 16, 25, 36, 49, 64, 81.

Take a few minutes to decide which of these primes can be written 
as a sum of two squares, and which can't. For each prime it takes almost 
no time at all to decide: for example, for the prime 41, you try the square 
36, but since 5 isn’t square, you go on to the next smaller square, 25, and
see that it works, since 25 + 16 = 41. For the prime 31, you try the square
25 first, but 6 isn’t a square; so you try the square 16 next, but 15 isn’t a 
square, and you can stop since the remaining squares are too small for 
two of them to add to 31; so, 31 can’t be a sum of two squares. This brief 
exercise is well worth doing if you want to get a feeling for what is so 
remarkable about Fermat’s theorem on primes written as a sum of two 
squares. 

Here are the primes less than 100 that cannot be written as a sum of
two squares:

3, 7, 11, 19, 23, 31, 43, 47, 59, 67, 71, 79, 83,

and here are the ones that can be written as a sum of two squares:

2, 5, 13, 17, 29, 37, 41, 53, 61, 73, 89, 97.

The thing to do next is to see what it is that distinguishes these two lists
of numbers (ignoring 2, since it is the “oddest” prime, as an old joke
goes).

It doesn’t take long to realize that every prime in the first group is of
the form $4n + 3$, and every prime (except 2) in the second group is of
the form $4n + 1$. It is now obvious what Fermat’s theorem is going to be!
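
If you would rather let a machine repeat the experiment, the following Python sketch sorts the primes below 100 into the two lists and then looks at their remainders on division by 4. The helper names `is_prime` and `is_sum_of_two_squares` are our own, and the search is deliberately naive, mirroring the trial-and-error procedure described above.

```python
from math import isqrt


def is_prime(n):
    """Trial division; perfectly adequate for n < 100."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))


def is_sum_of_two_squares(n):
    """True if n = x^2 + y^2 with x, y >= 1 (the squares 1, 4, 9, ...)."""
    for x in range(1, isqrt(n) + 1):
        rest = n - x * x
        if rest >= 1 and isqrt(rest) ** 2 == rest:
            return True
    return False


primes = [p for p in range(2, 100) if is_prime(p)]
can = [p for p in primes if is_sum_of_two_squares(p)]
cannot = [p for p in primes if not is_sum_of_two_squares(p)]

print("can:   ", can)
print("cannot:", cannot)
# Remainders modulo 4 in each list (leaving out the prime 2).
print({p % 4 for p in can if p != 2}, {p % 4 for p in cannot})
```

The last line prints `{1} {3}`, which is exactly the pattern just observed.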

Moreover, it is even quite easy to see why primes in the first group
cannot be a sum of two squares. In fact, we already saw this in Chapter 1,
since any square $x^2$ must have the form either $4n$ or $4n + 1$. So the sum
of two squares can never have the form $4n + 3$. (Recall the proof from
Chapter 1: if $x$ is even, then $x^2 = (2k)^2 = 4(k^2)$; or, if $x$ is odd, then
$x^2 = (2k + 1)^2 = 4(k^2 + k) + 1$.)

Half of Fermat’s theorem thus is easy: primes of the form $4n + 3$ are
never the sum of two squares. But what about the other half of the
theorem? Did you have a feeling of surprise when all of the other primes
seemed to work? Take 89, for example. When you start with 89 you find
only three chances for it to work, since you have to use one of the three
squares greater than 36. One chance is 81, but 8 is not a square; another
chance is 49, but then 40 is not a square either; so, isn't it amazing that
the last chance, 64, just happens to work, since 25 is a square? Why does
this keep working, for prime after prime after prime of the form $4n + 1$,
when the odds seem so poor?

Here is Fermat’s amazing theorem, the theorem he so proudly announced
to Mersenne in his letter on Christmas day in 1640. For
the moment we will have to leave the proof half finished. The first
published proof was not to come for another century, and we will be
able to present that proof in the next chapter. However, we will also
shortly present a version of the proof that Fermat himself undoubtedly
had, but first we need to learn a bit more about what Fermat
knew.


Theorem 5.1. An odd prime $p$ can be represented as a sum of two squares if
and only if $p$ has the form $4n + 1$. Moreover, this representation is unique.


In other words, not only can every prime of the form $4n + 1$ be
written as the sum of two squares, but there is only one way to do 
it. Fermat called his theorem the fundamental theorem on right-angled 
triangles (see Problems 5.5 and 5.6). We will soon return to a proof 
of this theorem, and to the larger question of which numbers are the 
sum of two squares, but first we look at one of Fermat’s most famous 
results. 


Fermat's Little Theorem 

In this section, we will discuss the theorem that is known as Fermat's
little theorem in order to distinguish it from his “big” theorem, that is,
Fermat’s last theorem. He discovered this theorem as a result of trying
to factor numbers such as $2^{37} - 1$. We will explain why he was interested
in doing such a thing later, but for now just imagine how difficult this
was. These days, we can factor $2^{37} - 1$ = 137 438 953 471 in the blink
of an eye using a computer, and even a hand calculator can do it in a
second or two, but in Fermat’s day this was a challenging problem to
say the least. Perhaps 137 438 953 471 is prime.

Here is how Fermat factored this large number once he discovered
his “little” theorem. This theorem told him that if a prime $p$ divides
$2^{37} - 1$, then 37 must divide $p - 1$. In other words, $p - 1 = 37n$,
that is, $p = 37n + 1$. But $p$ is an odd prime, so $n$ must be even,
say $n = 2k$, which means that $p = 37(2k) + 1 = 74k + 1$. So now
Fermat just tries all primes of the form
$74k + 1$ to see whether they divide 137 438 953 471; that is, he tries a list
of possible prime divisors: 149, 223, 593, ..., which is a far, far better
strategy than simply trying all primes 2, 3, 5, 7, 11, 13, ... .
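
This search strategy is short enough to sketch in Python. The code below is an illustration of the idea only, not a reconstruction of Fermat's own calculation; the names `is_prime` and `search_divisor` are ours. It walks through the numbers $74k + 1$, keeps the primes among them, and stops at the first one that divides $2^{37} - 1$.

```python
def is_prime(n):
    """Plain trial division, enough for the small candidates tried here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def search_divisor(N, step):
    """Try the primes of the form step*k + 1, in increasing order, as divisors of N.

    Returns the first prime divisor found (or None) and the list of primes tried.
    """
    tried = []
    p = step + 1
    while p * p <= N:
        if is_prime(p):
            tried.append(p)
            if N % p == 0:
                return p, tried
        p += step
    return None, tried


N = 2**37 - 1
divisor, tried = search_divisor(N, 74)
print(tried, divisor, N // divisor)
```

Only two candidates are needed: the program prints `[149, 223] 223 616318177`.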

In this case, Fermat quickly discovered that $149 \nmid 137\,438\,953\,471$,
but that $223 \mid 137\,438\,953\,471$, since 137 438 953 471 = 223 · 616 318 177.
Of course, Fermat was quite fortunate he found a divisor so
early in this list (we will learn later that there could have been at most
only about 780 primes on the list for him to check in any event). We will
come back to why Fermat was interested in numbers such as $2^{37} - 1$ in a
moment, but let’s see how it led him to his theorem.

Fermat explained the idea in a letter to Bernard Frenicle de Bessy in
1640 using a geometric progression

$$3,\ 3^2 = 9,\ 3^3 = 27,\ 3^4 = 81,\ 3^5 = 243,\ 3^6 = 729,\ \ldots,$$

where he takes as an example the prime number 13, and says that 13
divides $27 - 1$, and then points out two things:

First, that the exponent for $27 = 3^3$ is 3, and 3 divides 12, which is
$13 - 1$.

Second, he points out that the next place in the progression
where this happens is at 729, because the exponent of $729 = 3^6$ is
6, which is the next multiple of the exponent 3.

And, indeed, we can see that 13 divides $729 - 1$, since $728 = 13 \cdot 56$.
In this way, 13 will divide any of these powers minus one whenever
the exponent is a multiple of 3. At this point, you should verify with
a calculator that

$$13 \mid 3^9 - 1, \quad 13 \mid 3^{12} - 1, \quad 13 \mid 3^{15} - 1, \quad 13 \mid 3^{18} - 1.$$

Let’s do another example using the geometric progression

$$8,\ 8^2 = 64,\ 8^3 = 512,\ 8^4 = 4096,\ 8^5 = 32\,768,\ \ldots$$

and the prime number 31. Here we see that the first value of $n$ for which
$31 \mid (8^n - 1)$ is $n = 5$, and $8^5 - 1 = 32\,767 = 31 \cdot 1057$. This means that the
next occurrence of this will be $31 \mid (8^{10} - 1)$, and we can check that indeed
$8^{10} - 1 = 1\,073\,741\,823 = 31 \cdot 34\,636\,833$. Note also that $5 \mid (31 - 1) = 30$.

Let’s do one final example using the geometric progression

$$2,\ 2^2 = 4,\ 2^3 = 8,\ 2^4 = 16,\ 2^5 = 32,\ 2^6 = 64,\ \ldots$$

and the prime number 19. By now you should realize that if $n$ is the first
exponent such that $19 \mid (2^n - 1)$, then $n$ needs to divide $(19 - 1) = 18$,
in other words, $n$ = 1, 2, 3, 6, 9, or 18. The first four values of $n$ clearly
don’t work, and also $19 \nmid 511 = 2^9 - 1$. Our last chance it would seem is
$n = 18$. It should be no surprise, then, that $19 \mid 262\,143 = 2^{18} - 1$, since
$262\,143 = 19 \cdot 13\,797$.
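
The three examples can be replayed with a few lines of Python. This is only a sketch: the function name `first_exponent` is ours, and it assumes that $p$ is a prime that does not divide $a$ (otherwise no power of $a$ ever leaves remainder 1 and the loop would not stop).

```python
def first_exponent(a, p):
    """Smallest n >= 1 such that p divides a**n - 1.

    Assumes p is prime and does not divide a.
    """
    n, power = 1, a % p
    while power != 1:
        power = (power * a) % p   # remainder of the next term of the progression
        n += 1
    return n


for a, p in [(3, 13), (8, 31), (2, 19)]:
    n = first_exponent(a, p)
    # Print the progression base, the prime, the first exponent,
    # and whether that exponent divides p - 1.
    print(a, p, n, (p - 1) % n == 0)
```

The output is `3 13 3 True`, `8 31 5 True`, and `2 19 18 True`, in agreement with the first exponents 3, 5, and 18 found above.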

Here is the way Fermat stated his theorem (only slightly paraphrased) 
in his letter to Frenicle (Fermat, like Euclid, used the phrase “measures” 
where today we would say “divides”): 

Every prime number infallibly measures one of the powers minus 
one of any progression whatever, and the given prime number
minus one is a multiple of the exponent of the said power; and,
after one has found the first power which answers this question, 
all those whose exponents are multiples of the exponent of the 
first one all answer to the question. 

Fermat then told Frenicle, “I would send you the proof if I did not fear it 
would be too long.” 

Here is the way we state Fermat's little theorem these days. 

Theorem 5.2 (Fermat's little theorem). If $p$ is a prime and does not divide
$a$, then

$$a^{p-1} \equiv 1 \pmod{p}.$$

Before we prove this theorem, let’s look at an example to see why this 
theorem is true. We can then use this basic idea to present Fermat’s 
proof, in modern language, that almost without doubt was the one he 
feared would be too long to send to Frenicle. In the next chapter we will 
give another, very slick proof of this theorem that was discovered much 
later. 

So, consider again the progression $1, 8, 8^2, 8^3, \ldots$ using the prime
31. Let's write this progression modulo 31. (Fermat thought in terms
of remainders, but we might as well take advantage of the language of
congruences to express the same ideas.) We get

1, 8, 2, 16, 4, 1, 8, 2, ...

We see that $n = 5$ is the first time that $8^n \equiv 1 \pmod{31}$ and,
therefore, the first time $31 \mid (8^n - 1)$. We also see that this progression
will now just repeat itself forever in cycles of 1, 8, 2, 16, 4.

What we need to see now is why this cycle length 5 had to divide
30. The idea is to look at the set of all possible remainders, except 0,
modulo 31:

{1, 2, 3, ..., 29, 30}.

As we saw, the first five terms $1, 8, 8^2, 8^3, 8^4$ in the progression all
have distinct remainders, and form the set {1, 8, 2, 16, 4}. Now, we
can take any remainder not in this set, say 3, and consider the five
terms $3,\ 3 \cdot 8,\ 3 \cdot 8^2,\ 3 \cdot 8^3,\ 3 \cdot 8^4$; these five terms have remainders
{3, 24, 6, 17, 12}. These are five new remainders.

We now have 10 of the 30 possible nonzero remainders. Do this
again, taking a remainder we don’t already have, say 5, and the
five terms are $5,\ 5 \cdot 8,\ 5 \cdot 8^2,\ 5 \cdot 8^3,\ 5 \cdot 8^4$; and the remainders are
{5, 9, 10, 18, 20}.

Now we have 15 of the 30 possible nonzero remainders. Repeat
this three more times, getting, in turn, three more sets of remainders
{7, 25, 14, 19, 28}, {11, 26, 22, 21, 13}, and {15, 27, 30, 23, 29}.

So the 30 possible nonzero remainders from 1 to 30 have been neatly
partitioned into six sets, each set having exactly five elements, with no
remainders left over, which nicely explains why $n = 5$ divides $p - 1 = 30$.

The key thing we need to check in our general proof is that each time
we multiply the terms $1, 8, 8^2, 8^3, 8^4$ by a new remainder, we don’t get
any repetitions with previous remainders.
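
The partition just described can be generated mechanically. The Python sketch below does this for any prime $p$ and any $a$ not divisible by $p$; the function name `partition_remainders` is ours, and at each stage it picks the smallest remainder not yet used, which happens to reproduce the choices 3, 5, 7, 11, 15 made above.

```python
def partition_remainders(a, p):
    """Split {1, ..., p-1} into the sets {r, r*a, r*a^2, ...} of remainders mod p.

    Assumes p is prime and does not divide a.
    """
    # The basic cycle 1, a, a^2, ... up to (but not including) the return to 1.
    cycle = [1]
    nxt = a % p
    while nxt != 1:
        cycle.append(nxt)
        nxt = (nxt * a) % p

    seen, parts = set(), []
    for r in range(1, p):
        if r not in seen:                      # smallest remainder not yet used
            part = [(r * c) % p for c in cycle]
            parts.append(part)
            seen.update(part)
    return parts


for part in partition_remainders(8, 31):
    print(part)
```

For $a = 8$ and $p = 31$ this prints six lists of five remainders each, beginning with `[1, 8, 2, 16, 4]` and `[3, 24, 6, 17, 12]`.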

Proof of Theorem 5.2

Each term in the progression $1, a, a^2, a^3, \ldots$ has a remainder in the set

$$\{1, 2, 3, \ldots, p - 1\}$$

when divided by $p$. Therefore, since this is an infinite progression, there
will be a first time in this progression where a remainder repeats, that is,
where $a^{k+n} \equiv a^k \pmod{p}$. (In fact, $a^k$ will be 1, but we don’t know that
yet.)

But $a$ and $p$ are relatively prime, so, by Theorem 3.7, we can divide
this congruence by $a^k$, and get $a^n \equiv 1 \pmod{p}$. Therefore, the remainder
of $a^n$ is 1, and we know that $a^k$ is 1, after all; we also know that each
of the $n$ terms in the set

$$\{1, a, a^2, a^3, \ldots, a^{n-1}\}$$

has a distinct remainder, otherwise $a^n$ would not have been the first time
a remainder repeated.

Now, from the set of remainders $\{1, 2, 3, \ldots, p - 1\}$, we can take any
remainder $r$, and form a set

$$\{r, ra, ra^2, ra^3, \ldots, ra^{n-1}\}.$$

The first thing to prove about this set is that it has $n$ different elements
modulo $p$; that is, each term in this set has a distinct remainder modulo
$p$. Suppose that $ra^i \equiv ra^j \pmod{p}$, where $0 \le i, j \le n - 1$. By Theorem 3.7,
we can divide this congruence by $r$ to get $a^i \equiv a^j \pmod{p}$, which means
that $i = j$. So the
remainders are distinct, and this set has $n$ different elements modulo $p$.

The second thing to prove is that we can use these sets, by taking all
values of $r$, to partition the set of remainders: $\{1, 2, 3, \ldots, p - 1\}$. To
prove this we show that for any two remainders $r$ and $s$, there are only
two possibilities for the remainders in the two sets

$$\{r, ra, ra^2, ra^3, \ldots, ra^{n-1}\} \quad\text{and}\quad \{s, sa, sa^2, sa^3, \ldots, sa^{n-1}\}:$$

(i) the remainders in the two sets are exactly the same; or

(ii) the remainders in the two sets are completely different.

In the latter case, we say the two sets of remainders are disjoint.
Well, if the sets of remainders aren’t disjoint (that is, if (ii) isn’t the
case), then $ra^i \equiv sa^j \pmod{p}$, for some $i$ and $j$, and we may as well
assume $i \le j$. Then divide by $a^i$ to get $r \equiv sa^{j-i} \pmod{p}$. This
means that as we cycle through the remainders in the set on the left:
$r, ra, ra^2, \ldots$, we are also just cycling through the remainders in the
set on the right: $sa^{j-i}, sa^{j-i+1}, sa^{j-i+2}, \ldots$. Thus the remainders in the
two sets are identical.

Therefore, we successfully partitioned the set $\{1, 2, 3, \ldots, p - 1\}$,
which has $p - 1$ elements, into sets that each have $n$ elements, and so
we can conclude that $n \mid (p - 1)$.

The final step in the proof is to note that since $n \mid (p - 1)$ we can write

$$a^{p-1} - 1 = (a^n - 1)(a^{p-1-n} + a^{p-1-2n} + \cdots + 1);$$

and, since $a^n \equiv 1 \pmod{p}$, we know that $p \mid (a^n - 1)$, which therefore
implies that $p \mid (a^{p-1} - 1)$.

This completes the proof. ■

Because Fermat actually said more than we state in Theorem 5.2, 
and because — following Fermat — we in fact proved more, we record the 
following corollary to the proof of Theorem 5.2. Note that it is the 
corollary, not the theorem, that is the heart of the matter. 

Corollary 1. If $p$ is a prime and does not divide $a$, and if $n$ is the least positive
integer such that $a^n \equiv 1 \pmod{p}$, then $n \mid (p - 1)$.

We often like to state Fermat’s little theorem in a slightly different
form, which we get by multiplying the congruence $a^{p-1} \equiv 1 \pmod{p}$
by $a$. Note that in this form, the congruence remains true even in the
totally uninteresting case where $p \mid a$.

Corollary 2 (Fermat's little theorem, alternate form). If $p$ is a prime,
then for any integer $a$,

$$a^p \equiv a \pmod{p}.$$

In Problem 5.40, we will ask you to give a different proof of
Corollary 2 using induction and following the original argument that
Fermat made for the case $a = 2$.
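
Both forms of the theorem are easy to test numerically. The sketch below uses Python's built-in three-argument `pow(a, n, p)`, which returns the remainder of $a^n$ on division by $p$; the two function names and the bound `limit` are our own choices for illustration. A numerical check of this kind is of course no substitute for the proof.

```python
def little_theorem_holds(p):
    """Check a^(p-1) = 1 (mod p) for every a in 1, ..., p-1."""
    return all(pow(a, p - 1, p) == 1 for a in range(1, p))


def alternate_form_holds(p, limit=200):
    """Check a^p = a (mod p) for a = 0, 1, ..., limit-1, multiples of p included."""
    return all(pow(a, p, p) == a % p for a in range(limit))


for p in [2, 3, 5, 7, 13, 19, 31]:
    print(p, little_theorem_holds(p), alternate_form_holds(p))

# The hypothesis that p is prime matters: modulo 15 the congruence fails.
print(15, little_theorem_holds(15), pow(2, 14, 15))
```

Every prime in the list gives `True True`, while the last line prints `15 False 4`, since $2^{14}$ leaves remainder 4, not 1, on division by 15.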


Primes as Sums of Two Squares 

With Fermat’s little theorem in hand, we are now ready to see how 
Fermat finished his proof of Theorem 5.1, or, as he called it, his fundamental
theorem on right-angled triangles. In the same letter to Huygens, in
1659, in which Fermat mentioned using his method of infinite descent 
to prove negative assertions such as Theorem 1.2, his theorem that no 
Pythagorean triangle has square area, he writes: 

For a long time I was unable to apply my method to affirmative 
questions ... so much so that when it occurred to me to prove 
that every prime number which is one more than a multiple of 4 
is a sum of two squares I found myself in a good deal of trouble. 

But finally a line of thought gone over many times showed me a 
light which did not fail, and affirmative questions surrendered to 
my method. 

We do not have the details of Fermat’s proof, only the briefest description
to Huygens in this letter that a version of infinite descent
was used. In a standard infinite descent proof you would argue that
if one prime of the form $4n + 1$ was not a sum of two squares, then
smaller and smaller primes of the same form could be found, none
being a sum of two squares, until eventually reaching the smallest such
prime, 5, which then also would not be a sum of two squares. But, since
$5 = 2^2 + 1^2$ is clearly a sum of two squares, this contradiction would
prove the theorem.

Perhaps you can see why Fermat found himself in “a good deal of
trouble.” It isn’t clear how you can take a given prime of the form $4n + 1$
that is not a sum of two squares, and produce a smaller prime of the same
form that is also not a sum of two squares. Nevertheless, descent can
still be used here, and to see the main idea let's look at an example. We
will illustrate a general process by writing the prime 89 as a sum of two
squares. The first thing we do (and this will require proof that we can
always do this) is to find a number $a$ such that $a^2 + 1 \equiv 0 \pmod{89}$ or,
equivalently, $a^2 \equiv -1 \pmod{89}$.

In this case, a = 34 works, since
\(34^2 + 1 = 1156 + 1 = 1157 = 13 \cdot 89\). So we can write

\[ 13 \cdot 89 = 34^2 + 1^2. \]

Note that at this point we have a multiple of 89 written as a sum of two
squares. (This is an “affirmative” statement that is about to “surrender”
to Fermat's method.) The key idea is that now we will use descent to
make this multiple smaller at each stage until we finally reach 89 as a
sum of two squares.

Next we find the remainder of 34 when it is divided by 13, which is
8; that is, \(34 \equiv 8 \pmod{13}\). We replace the 34 by 8 in
\(34^2 + 1^2\) to get \(8^2 + 1^2\).

Note that since \(34^2 + 1^2 \equiv 0 \pmod{13}\), then
\(8^2 + 1^2 \equiv 0 \pmod{13}\) also. Therefore, 13 is guaranteed to
divide \(8^2 + 1^2\). And it does, so we can write

\[ 13 \cdot 5 = 8^2 + 1^2. \]


Now comes the incredibly clever step in this proof: multiply these
two equations together

\[ 13^2 \cdot 5 \cdot 89 = (34^2 + 1^2)(8^2 + 1^2) \]

using the fundamental identity of Fibonacci for the sums of squares,
given on page 118, to get

\[ 13^2 \cdot 5 \cdot 89 = (34 \cdot 8 + 1 \cdot 1)^2 + (34 \cdot 1 - 1 \cdot 8)^2 = 273^2 + 26^2 = (13 \cdot 21)^2 + (13 \cdot 2)^2, \]

which, dividing by \(13^2\) on both sides, simplifies to

\[ 5 \cdot 89 = 21^2 + 2^2. \]

Thus, descent worked! We now have a smaller multiple of 89 written as 
a sum of two squares. We should be able to repeat the process. 

So we find the remainder of 21 when it is divided by 5, which is 1, and
we replace the 21 by 1 in \(21^2 + 2^2\) to get \(1^2 + 2^2\). Then, since
\(1^2 + 2^2\) is a multiple of 5, we write

\[ 5 \cdot 1 = 1^2 + 2^2. \]

Multiplying these two equations together gives us

\[ 5^2 \cdot 89 = (21^2 + 2^2)(1^2 + 2^2), \]

and, using the fundamental identity of Fibonacci once again we get

\[ 5^2 \cdot 89 = (21 \cdot 1 + 2 \cdot 2)^2 + (21 \cdot 2 - 2 \cdot 1)^2 = 25^2 + 40^2 = (5 \cdot 5)^2 + (5 \cdot 8)^2, \]

which, dividing by \(5^2\), simplifies to

\[ 89 = 5^2 + 8^2. \]

That’s how descent can be used to write a prime as a sum of two 
squares. You start with a multiple of the prime written as a sum of two 
squares, and then use descent until you reach the prime itself written as 
a sum of two squares. 

We are ready to prove one of Fermat’s finest results: Theorem 5.1. The
first thing we needed to be able to do in our example above was to find
a number a such that \(a^2 + 1 \equiv 0 \pmod{89}\). So our first step in
the proof of Theorem 5.1 is to prove that we can always do that in general.

Lemma 1. Let p be a prime of the form \(4n + 1\). Then there is a number a
such that \(a^2 + 1 \equiv 0 \pmod{p}\).

Proof 

We will assume there is no such number, and reach a contradiction. Let
x be relatively prime to p.

Then, by Theorem 5.2 (Fermat’s little theorem),
\(x^{4n} - 1 = x^{p-1} - 1\) is divisible by p. (This is where the
hypothesis that p has the form \(4n + 1\) is used in the proof!) So we
write \(x^{4n} - 1\) as

\[ x^{4n} - 1 = (x^{2n} - 1)((x^n)^2 + 1) \]

and conclude that \(p \mid (x^{2n} - 1)\), since by assumption
\(p \nmid ((x^n)^2 + 1)\).

Therefore, for every value of x in the interval
\(1 \le x \le 4n = p - 1\), \(x^{2n} \equiv 1 \pmod{p}\); but this is the
contradiction we were looking for, since this congruence could be
satisfied by at most 2n values of x in this interval (it is likely Fermat
knew this, though not in the language of congruences; and we will prove
it rigorously later).

This completes the proof. ■ 
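The proof of Lemma 1 can be read as a recipe: look for an x whose power
\(x^{2n}\) is not congruent to 1, and then \(a = x^n\) satisfies
\(a^2 + 1 \equiv 0 \pmod{p}\). The following Python sketch tries this by
brute force. The function names are ours, chosen only for illustration,
and the code assumes that p really is a prime of the form \(4n + 1\).

```python
def root_of_minus_one(p):
    """Return a with a*a + 1 divisible by p, for a prime p of the form 4n + 1.

    Mirrors the proof of Lemma 1: find x with x^(2n) not congruent to 1
    (mod p); then p must divide (x^n)^2 + 1, so a = x^n (mod p) works.
    """
    if p % 4 != 1:
        raise ValueError("p must have the form 4n + 1")
    n = (p - 1) // 4
    for x in range(1, p):
        if pow(x, 2 * n, p) != 1:
            return pow(x, n, p)
    raise ValueError("no suitable x found; is p really prime?")


def count_solutions(p):
    """Count the x in 1..p-1 with x^(2n) congruent to 1 (mod p)."""
    n = (p - 1) // 4
    return sum(1 for x in range(1, p) if pow(x, 2 * n, p) == 1)


if __name__ == "__main__":
    a = root_of_minus_one(89)
    print(a, (a * a + 1) % 89)      # the second number printed is 0
    print(count_solutions(89))      # only half of the 88 values qualify
```

The count illustrates the step the proof leans on: only 2n of the 4n
values of x can satisfy \(x^{2n} \equiv 1 \pmod{p}\), so a suitable x
always turns up.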

Proof of Theorem 5.1 

We already know that no prime of the form \(4n + 3\) can ever be the sum
of two squares. Now we have to prove the harder half of the theorem,
namely, that all primes of the form \(4n + 1\) can be represented as a sum
of two squares. Then we also must show that this representation is unique.

Let p be a prime of the form \(4n + 1\).

So, by Lemma 1, there is a positive integer a, necessarily relatively
prime to p, such that \(p \mid (a^2 + 1)\), and we can write

\[ kp = a^2 + 1^2. \]



Further, we may as well assume that \(a < p/2\). (Replace a by the
remainder r when a is divided by p, and if \(r > p/2\) use \(p - r\)
instead.)

The idea of the proof is that we have a multiple kp of p expressed as a 
sum of two squares, and we will show that a smaller multiple can also be 
so expressed. Eventually, then, the “multiple” will just be 1 • p, and we 
will be done. 

The next step is that for any square in the current expression (in this
case that means \(a^2\) or \(1^2\), but we will call them \(a^2\) and
\(b^2\) to be more clear in general) we replace, if necessary, a and b by
their remainders r and s when divided by k, choosing a negative remainder
as needed so that the remainders fall within the interval
\([-k/2, k/2]\).

Since \(k \mid (a^2 + 1^2)\), or, in the general step,
\(k \mid (a^2 + b^2)\), it follows that \(k \mid (r^2 + s^2)\), since
\(r \equiv a\) and \(s \equiv b \pmod{k}\), and we can write

\[ km = r^2 + s^2. \]


The key observation is that

\[ m = \frac{r^2 + s^2}{k} \le \frac{(k/2)^2 + (k/2)^2}{k} = \frac{k}{2}, \]

so, as we will soon see, Fermat’s idea of descent is well in hand, since
\(m < k\). Now it is time for the clever step in the proof.

Multiply these two equations, and use the fundamental identity of
Fibonacci from page 118, to get

\[ k^2 mp = (a^2 + b^2)(r^2 + s^2) = (ar + bs)^2 + (as - br)^2. \]

But, since \(k \mid (a^2 + b^2)\), it follows that \(k \mid (ar + bs)\) and
\(k \mid (as - br)\) (see Problem 5.12). Therefore, we can divide both
sides by \(k^2\), and get

\[ mp = c^2 + d^2, \]

where \(c = (ar + bs)/k\) and \(d = (as - br)/k\).

Since \(m < k\) this completes the descent argument, and completes the
proof that any prime of the form \(4n + 1\) can be represented as a sum of
two squares.

Showing that this representation is unique is left as an exercise in 
Problem 5.13. This completes the proof. ■ 
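The descent in this proof is easy to run mechanically. Here is a minimal
Python sketch of it, assuming we are handed a prime p of the form
\(4n + 1\) together with a number a for which p divides \(a^2 + 1\). The
names `centered_remainder` and `two_squares_by_descent` are ours. Because
the sketch always takes remainders in the interval from \(-k/2\) to
\(k/2\), as the proof does, its intermediate multiples can differ from
those in the worked example for 89, which used the ordinary remainder 8.

```python
def centered_remainder(x, k):
    """Remainder of x modulo k, shifted into the interval (-k/2, k/2]."""
    r = x % k
    return r - k if 2 * r > k else r


def two_squares_by_descent(p, a):
    """Return (c, d) with c <= d and c*c + d*d == p.

    Assumes p is a prime of the form 4n + 1 and that p divides a*a + 1.
    """
    a %= p                      # guarantees the first multiple k is below p
    b = 1
    k, leftover = divmod(a * a + b * b, p)
    if leftover != 0:
        raise ValueError("p must divide a^2 + 1")
    while k > 1:
        r = centered_remainder(a, k)
        s = centered_remainder(b, k)
        # Fibonacci's identity, followed by division of both terms by k.
        a, b = abs(a * r + b * s) // k, abs(a * s - b * r) // k
        k = (a * a + b * b) // p    # the new, strictly smaller multiple
    return tuple(sorted((a, b)))


if __name__ == "__main__":
    print(two_squares_by_descent(89, 34))
    print(two_squares_by_descent(29, 12))
```

For p = 89 and a = 34 the multiples run 13, 2, 1, and the sketch ends
with the same representation found in the example.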


Sums of Two Squares 

As mentioned earlier, Fermat knew by 1640 exactly which numbers
could be written as a sum of two squares. What do we know at this point?
We know that no prime of the form \(4n + 3\) can be written as a sum of
two squares. We also know, because of Theorem 5.1 and the fundamental
identity of Fibonacci, that any number whose prime decomposition
consists only of primes of the form \(4n + 1\) can be written as a sum of
two squares.

But what about numbers such as

\[ 2541 = 3 \cdot 7 \cdot 11^2, \quad 3185 = 5 \cdot 7^2 \cdot 13, \quad 3575 = 5^2 \cdot 11 \cdot 13, \]

whose prime decompositions include primes of the form \(4n + 3\)? Can
we tell by looking at the prime decomposition of a number whether it
can, or can’t, be written as a sum of two squares?

The following theorem tells us how to do that. We do not have 
Fermat’s proof of this result; the proof here is based on that given by 
Euler in 1742, but Fermat surely must have used a very similar argument. 

Theorem 5.3. Let \(N = a^2 + b^2\) be a sum of the squares of two
relatively prime integers a and b. If p is an odd prime divisor of N, then
p is of the form \(4n + 1\).

Proof 

Since \(p \mid (a^2 + b^2)\), and a and b are relatively prime, p cannot
divide either a or b.

Assume that \(p = 4n + 3\). We will reach a contradiction. Let
\(k = 2n + 1\); that is, \(2k = 4n + 2 = p - 1\). First, we claim that
\(p \mid (a^{2k} - b^{2k})\). This follows immediately, since we can write

\[ a^{2k} - b^{2k} = (a^{p-1} - 1) - (b^{p-1} - 1), \]

and, by Fermat’s little theorem, \(p \mid (a^{p-1} - 1)\) and
\(p \mid (b^{p-1} - 1)\).

Next, we claim that \(p \mid (a^{2k} + b^{2k})\). This follows from the
fact that since k is odd we can write

\[ a^{2k} + b^{2k} = (a^2 + b^2)(a^{2k-2} - a^{2k-4}b^2 + a^{2k-6}b^4 - \cdots + b^{2k-2}), \]

and because, by hypothesis, \(p \mid (a^2 + b^2)\).

Now, since p divides both \(a^{2k} - b^{2k}\) and \(a^{2k} + b^{2k}\), it
must also divide their sum, that is, \(p \mid 2a^{2k}\). Thus, since p is
odd, \(p \mid a\), and we have a contradiction. This completes the
proof. ■
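A theorem like this is easy to spot-check by machine. The Python sketch
below searches small relatively prime pairs a, b for an odd prime divisor
of \(a^2 + b^2\) that is not of the form \(4n + 1\). It is only an
experiment, not a proof, and the helper names are ours.

```python
from math import gcd


def odd_prime_divisors(n):
    """Return the set of odd primes dividing n, found by trial division."""
    primes = set()
    while n % 2 == 0:
        n //= 2
    d = 3
    while d * d <= n:
        while n % d == 0:
            primes.add(d)
            n //= d
        d += 2
    if n > 1:
        primes.add(n)
    return primes


def find_counterexample(limit):
    """Look for relatively prime a <= b <= limit violating Theorem 5.3.

    Returns a triple (a, b, p) if one exists, and None otherwise.
    """
    for a in range(1, limit + 1):
        for b in range(a, limit + 1):
            if gcd(a, b) != 1:
                continue
            for p in odd_prime_divisors(a * a + b * b):
                if p % 4 != 1:
                    return (a, b, p)
    return None


if __name__ == "__main__":
    print(odd_prime_divisors(34 * 34 + 1))
    print(find_counterexample(60))
```

Dropping the test on gcd(a, b) makes counterexamples appear at once (for
example a = b = 3 gives 18, which is divisible by 3), which shows why the
hypothesis that a and b are relatively prime is needed.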

Theorem 5.3 completely answers the question as to which numbers
can be represented as the sum of two squares. If a number
\(N = a^2 + b^2\) is a sum of two squares, then we can factor out the
greatest common divisor d of a and b, and write N as

\[ N = a^2 + b^2 = (da_1)^2 + (db_1)^2 = d^2(a_1^2 + b_1^2), \]

where \(a_1\) and \(b_1\) are relatively prime. But Theorem 5.3 says that
no prime of the form \(4n + 3\) can divide \(a_1^2 + b_1^2\), which means
that any prime p of the form \(4n + 3\) that divides N must divide d, and
therefore p would divide N an even number of times.

Conversely, let N be a number such that in the canonical representation
of N any prime of the form \(4n + 3\) appears with an even exponent. Then,
write \(N = m^2 n\), where \(m^2\) is the part of the canonical
representation of N that has primes to an even power, and n is the part
that has primes to an odd power. For example, we would write

\[ 3^4 \cdot 5^3 \cdot 7^2 \cdot 11^6 \cdot 13 \cdot 17^2 = (3^2 \cdot 7 \cdot 11^3 \cdot 17)^2 (5^3 \cdot 13). \]

Since n consists only of primes that are either 2 or of the form
\(4n + 1\), and since each of these can be written as a sum of two
squares, we can use the fundamental identity of Fibonacci as many times
as necessary to write n as a sum of two squares, that is,
\(n = a^2 + b^2\). But then \(N = m^2 n = (ma)^2 + (mb)^2\) is also a sum
of two squares. Thus we have proved the following corollary to
Theorem 5.3.

Corollary 3. A positive integer N can be written as a sum of two squares
if and only if each prime of the form \(4n + 3\) appearing in the canonical
factorization of N is raised to an even power.

Thus \(3185 = 5 \cdot 7^2 \cdot 13\) can be written as a sum of two squares:

\[ 3185 = 7^2 \cdot 65 = 7^2(8^2 + 1^2) = 56^2 + 7^2. \]

But neither \(2541 = 3 \cdot 7 \cdot 11^2\) nor
\(3575 = 5^2 \cdot 11 \cdot 13\) can be written as a sum of two squares
since they have primes of the form \(4n + 3\) raised to an odd power:
\(3^1\), \(7^1\), \(11^1\).
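Corollary 3 turns the question into a test on the prime factorization.
The Python sketch below compares that test with a direct search for the
two squares. It is a brute-force illustration with function names of our
own choosing, and it allows one of the squares to be \(0^2\), as the
argument above implicitly does when n = 1.

```python
from math import isqrt


def factorize(n):
    """Return {prime: exponent} for n >= 1, by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def passes_corollary_3(n):
    """True if every prime of the form 4n + 3 occurs to an even power."""
    return all(e % 2 == 0 for p, e in factorize(n).items() if p % 4 == 3)


def find_two_squares(n):
    """Search directly for a <= b with a*a + b*b == n; None if there is none."""
    for a in range(isqrt(n // 2) + 1):
        b = isqrt(n - a * a)
        if a * a + b * b == n:
            return (a, b)
    return None


if __name__ == "__main__":
    for n in (2541, 3185, 3575):
        print(n, factorize(n), passes_corollary_3(n), find_two_squares(n))
```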

Fermat thus was able to answer successfully the question raised by 
Bachet in his translation of Diophantus about when an integer is the 
sum of two squares. But as Fermat himself said, there are infinitely many 
questions of this kind and, as we shall see, much more to come along 
these same lines. 

We know that a number such as 59 can’t be written as a sum of two
squares, but it can be written as a sum of three:
\(59 = 1^2 + 3^2 + 7^2\). A number such as 71 can’t be written as a sum
of three squares, but it can be written as a sum of four:
\(71 = 1^2 + 3^2 + 5^2 + 6^2\). (Recall from Problem 1.16 that Fermat
knew that no integer of the form \(8n + 7\) could be written as a sum of
three squares.)

Perhaps the most obvious next series of questions to ask is when is an 
integer N a sum of three squares, of four squares, and so on? Bachet had 
asserted that every integer was either a square or a sum of two, three, or 
four squares, and he had called for a proof. His call would eventually be 
answered, not by Fermat, or even by Euler a century later, but by Joseph 
Louis Lagrange in 1770. 
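Bachet's assertion can be explored numerically. The following Python
sketch finds, by exhaustive search, the smallest number of positive
squares needed to write a given integer. It is meant only as an
experiment on small numbers; the function name `min_squares` is ours.

```python
from math import isqrt


def _is_sum_of(m, count):
    """Can m >= 1 be written as a sum of exactly `count` positive squares?"""
    if count == 1:
        r = isqrt(m)
        return r * r == m
    # Try every first square a*a that leaves a positive remainder.
    return any(_is_sum_of(m - a * a, count - 1)
               for a in range(1, isqrt(m - 1) + 1))


def min_squares(n):
    """Smallest number of positive squares that sum to n (n >= 1)."""
    count = 1
    while not _is_sum_of(n, count):
        count += 1
    return count


if __name__ == "__main__":
    for n in (89, 59, 71):
        print(n, min_squares(n))
    print(max(min_squares(n) for n in range(1, 300)))
```

Over the range tried, no integer needs more than four squares, and the
integers of the form \(8n + 7\) are among those that need all four.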

Fermat generalized Bachet’s assertion in a very striking way. In a letter 
to Mersenne, in 1638, he makes one of his most remarkable claims (see 
Problem 5.34), stating that every integer is 

the sum of three triangular numbers, of four squares, of five
pentagonal numbers, and so on.


Perfect Numbers 

It is clear, now that we have seen the proofs of Theorem 5.1 (including 
Lemma 1) and Theorem 5.3, that Fermat had to have discovered his 
little theorem in order to be able to prove these results about the sums 
of squares. It still remains for us to see what it was that first led Fermat 
to be interested in questions such as whether \(2^{37} - 1\) is prime. After all,
it was questions of this sort that led him to discover his celebrated little 
theorem. The answer goes all the way back to Euclid. 

The concept of perfect numbers is extremely old, perhaps going back 
to Archytas, one of the last of the Pythagoreans. A number is perfect if it 
is the sum of its proper divisors; so, for example, 6 is a perfect number 
because 6 = 1 + 2 + 3. 

The term proper is used here to exclude the number itself from the 
sum; thus, a proper divisor of a positive integer n is any positive divisor d 
of n other than the divisor n. This is similar to the use of the word proper 
in many other mathematical contexts. For example, we say that a set S
is a proper subset of a set T if \(S \subseteq T\), but \(S \neq T\).

The next perfect number is 28, since 28 = 1 + 2 + 4 + 7 + 14. Then 
come the next two perfect numbers: 496 and 8128. These four perfect 
numbers were known well over two thousand years ago. But the fifth 
perfect number didn’t appear until the fifteenth century! 

Books VII, VIII, and IX of Euclid’s Elements deal with number theory,
and the very last proposition in Book IX, Proposition 36, is the
following extraordinary theorem about perfect numbers. (In Chapter 9 we
will prove that all even perfect numbers must have the form described in
this theorem.)

Theorem 5.4. If \(2^n - 1\) is prime, then \(2^{n-1}(2^n - 1)\) is a
perfect number.

This amazing theorem of Euclid’s, at least in principle, makes it easy 
to produce perfect numbers, one after another: 

1. For n = 2, \(2^n - 1 = 2^2 - 1 = 3\) is prime, so

\[ 2^{2-1}(2^2 - 1) = 2 \cdot 3 = 6 \]

is a perfect number.

2. For n = 3, \(2^n - 1 = 2^3 - 1 = 7\) is prime, so

\[ 2^{3-1}(2^3 - 1) = 4 \cdot 7 = 28 \]

is a perfect number.

3. For n = 5, \(2^n - 1 = 2^5 - 1 = 31\) is prime, so

\[ 2^{5-1}(2^5 - 1) = 16 \cdot 31 = 496 \]

is a perfect number.

4. For n = 7, \(2^n - 1 = 2^7 - 1 = 127\) is prime, so

\[ 2^{7-1}(2^7 - 1) = 64 \cdot 127 = 8128 \]

is a perfect number.

Why did it take so long to find the fifth perfect number? Well, maybe
it is because the fifth perfect number is 33 550 336, which is pretty big;
and to find this perfect number you would need to know that
\(2^{13} - 1 = 8191\) is prime.
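Euclid's recipe is simple enough to run directly. The Python sketch
below lists the perfect numbers it produces for small n and checks each
one against the definition. Primality is tested by plain trial division,
which is our choice for the sketch and is adequate only for small
numbers; the function names are also ours.

```python
def is_prime(m):
    """Trial-division primality test, fine for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def sum_proper_divisors(m):
    """Sum of the positive divisors of m other than m itself."""
    total = 1 if m > 1 else 0
    d = 2
    while d * d <= m:
        if m % d == 0:
            total += d
            if d != m // d:
                total += m // d
        d += 1
    return total


def euclid_perfect_numbers(max_n):
    """Perfect numbers 2^(n-1) (2^n - 1) for n <= max_n with 2^n - 1 prime."""
    return [2 ** (n - 1) * (2 ** n - 1)
            for n in range(2, max_n + 1) if is_prime(2 ** n - 1)]


if __name__ == "__main__":
    for perfect in euclid_perfect_numbers(13):
        print(perfect, sum_proper_divisors(perfect) == perfect)
```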

In any event, it is because of this theorem of Euclid’s that Fermat
became interested in whether \(2^{37} - 1\) is prime. Note that in all the
examples we have seen so far where \(2^n - 1\) is prime, it is also the
case that n is prime. Fermat had written to Mersenne in 1640 that
\(2^n - 1\) could be prime only if n is prime (see Problem 5.14). And, in
case you are beginning to suspect that, conversely, \(2^p - 1\) is prime
whenever p is prime, recall that Fermat factored \(2^{37} - 1\) as
\(223 \cdot 616\,318\,177\). In Problem 5.16, you will be asked to show
that \(2^{11} - 1\) and \(2^{23} - 1\) are both composite, even though 11
and 23 are prime.
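With a computer, the three composite cases just mentioned take no
effort at all. The Python sketch below finds the smallest factor of
\(2^p - 1\) for p = 11, 23, and 37 by trial division; this is a
convenience of ours and of course not the method Fermat used.

```python
def smallest_factor(m):
    """Smallest divisor of m greater than 1 (m itself when m is prime)."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m


if __name__ == "__main__":
    for p in (11, 23, 37):
        m = 2 ** p - 1
        d = smallest_factor(m)
        print(f"2^{p} - 1 = {m} = {d} * {m // d}")
```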

Frenicle had written to Fermat asking for a perfect number between
\(10^{20}\) and \(10^{22}\), and since the only prime value of n for which
\(2^{n-1}(2^n - 1)\) falls within this range is the value n = 37, Fermat
knew he was really being asked whether \(2^{37} - 1\) was prime.

We will now prove Euclid’s theorem on perfect numbers. Note that in
our proof we find it convenient to prove a number is perfect by showing
that the sum of all the positive divisors of the number is twice the
number, rather than by showing that the sum of the positive proper
divisors of a number equals the number. Thus we might choose to say,
for example: 28 is perfect because
\(1 + 2 + 4 + 7 + 14 + 28 = 56 = 2 \cdot 28\), rather than to say: 28 is
perfect because \(1 + 2 + 4 + 7 + 14 = 28\). These are just two ways of
saying the same thing.

Proof of Theorem 5.4 

We intend to find all the divisors of \(2^{n-1}(2^n - 1)\), add them, and
show that this sum is twice \(2^{n-1}(2^n - 1)\).

The divisors of \(2^{n-1}\) are \(1, 2, 2^2, \ldots, 2^{n-1}\). The
divisors of the prime number \(2^n - 1\) are 1 and \(2^n - 1\). Any
divisor of \(2^{n-1}(2^n - 1)\) must be a product of a divisor of
\(2^{n-1}\) and a divisor of \(2^n - 1\).

Hence the sum of all the divisors of \(2^{n-1}(2^n - 1)\) is given by the
product

\[ (1 + 2 + 2^2 + \cdots + 2^{n-1})(1 + (2^n - 1)) = (2^n - 1)(1 + (2^n - 1)) = (2^n - 1)(2^n), \]

which is exactly twice the number \(2^{n-1}(2^n - 1)\) itself. Therefore,
\(2^{n-1}(2^n - 1)\) is a perfect number. This completes the proof. ■


Mersenne Primes 

Interest in finding perfect numbers was so high at the time among his
circle of correspondents that Mersenne made the absurdly bold claim
in 1644 that he could list all primes of the form \(2^p - 1\) for any
prime \(p \le 257\). In spite of the fact that Mersenne could not possibly
have confirmed his list in detail (the number \(2^{257} - 1\) has
seventy-eight digits, for example), Mersenne’s list of “primes” of the
form \(2^p - 1\) has kept mathematicians busy for a long time. It took
three centuries to finish checking it, and we now know he made only five
mistakes in the entire list. Thus we call numbers of the form
\(2^p - 1\) that are themselves prime numbers Mersenne primes. The search
for Mersenne primes continues to this day.

Mersenne listed the following eleven values of p for which he
claimed that \(2^p - 1\) was prime:

2, 3, 5, 7, 13, 17, 19, 31, 67, 127, 257,

and he asserted that \(2^p - 1\) was composite for all other values of p
less than 257. We will have much more to say about this list, and
about Mersenne primes in general, but at this point we will temporarily
conclude our discussion of Mersenne primes by relating one of the most
picturesque stories involving one of the five mistakes on Mersenne’s
famous list, the prime p = 67. Here is the story as told by Eric Temple
Bell, who knew the prominent American mathematician Frank Nelson
Cole, the star and only character in this quiet little drama.

At the October, 1903, meeting in New York of the American 
Mathematical Society, Cole had a paper on the program with the 
modest title On the factorization of large numbers. When the 
chairman called on him for his paper, Cole— who always was a 
man of very few words— walked to the board and, saying nothing, 
proceeded to chalk up the arithmetic for raising 2 to the 
sixty-seventh power. Then he carefully subtracted 1. Without a 
word he moved over to a clear space on the board and multiplied 
out by longhand, 

193,707,721 x 761,838,257,287. 

The two calculations agreed. Mersenne’s conjecture— if such it 
was— vanished into the limbo of mathematical mythology. For 
the first and only time on record, an audience of the American 
Mathematical Society vigorously applauded the author of a paper 
delivered before it. Cole took his seat without having uttered a 
word. Nobody asked him a question. 

Cole later told Bell that it took him “three years of Sundays” to crack
this number. (Today, I can factor \(2^{67} - 1\) with my hand calculator
in 2 minutes 25 seconds, and Sage can do it instantly!)
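Both Cole's blackboard multiplication and Mersenne's whole list can be
rechecked in a few lines of Python. The sketch below uses the
Lucas-Lehmer test, a standard primality test for numbers of the form
\(2^p - 1\) that is not discussed in this chapter; we bring it in only
because trial division would be hopeless for exponents near 257. The
variable and function names are ours.

```python
from math import isqrt


def lucas_lehmer(p):
    """Return True exactly when 2^p - 1 is prime; p must itself be prime."""
    if p == 2:
        return True
    m = 2 ** p - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


def primes_up_to(limit):
    """All primes q <= limit, by trial division."""
    return [q for q in range(2, limit + 1)
            if all(q % d for d in range(2, isqrt(q) + 1))]


# Mersenne's eleven exponents, as listed in the text above.
MERSENNE_LIST = [2, 3, 5, 7, 13, 17, 19, 31, 67, 127, 257]

actual = [p for p in primes_up_to(257) if lucas_lehmer(p)]
wrongly_included = sorted(set(MERSENNE_LIST) - set(actual))
wrongly_omitted = sorted(set(actual) - set(MERSENNE_LIST))

if __name__ == "__main__":
    # Cole's two calculations agree.
    print(193707721 * 761838257287 == 2 ** 67 - 1)
    print(wrongly_included, wrongly_omitted)
```

The two lists printed at the end together account for the five mistakes:
two exponents that should not have been on the list and three that were
left off.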

Mersenne and others in his circle seem to have become quite preoccupied
with finding numbers with properties similar to those of perfect
numbers, such as numbers where the sum of the proper divisors of a
number is a small multiple of the given number. Here is one example
proposed by Frenicle in 1643 that shows the extreme lengths they went
to in playing this game: the sum of the proper divisors of the following
number, presented as a product of its prime factors,

\[ 2^{36} \cdot 3^8 \cdot 5^5 \cdot 7^7 \cdot 11 \cdot 13^2 \cdot 19 \cdot 31^2 \cdot 43 \cdot 61 \cdot 83 \cdot 223 \cdot 331 \]

\[ \cdot\, 379 \cdot 601 \cdot 757 \cdot 1201 \cdot 7019 \cdot 112\,303 \cdot 898\,423 \cdot 616\,318\,177 \]

is five times the number itself (see Problem 5.20).
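Frenicle's claim can be checked without listing a single divisor. For a
prime power \(p^e\) the divisors are \(1, p, \ldots, p^e\), whose sum is
\((p^{e+1} - 1)/(p - 1)\), and, just as in the proof of Theorem 5.4, the
divisor sum of a product of prime powers for distinct primes is the
product of these sums. The Python sketch below applies this, taking the
factorization as given; it assumes, without testing, that every listed
factor is prime, and it reads the exponents of 5 and 7 as 5 and 7.

```python
# Frenicle's number as {prime: exponent}; primality of the keys is assumed.
FRENICLE = {
    2: 36, 3: 8, 5: 5, 7: 7, 11: 1, 13: 2, 19: 1, 31: 2, 43: 1, 61: 1,
    83: 1, 223: 1, 331: 1, 379: 1, 601: 1, 757: 1, 1201: 1, 7019: 1,
    112303: 1, 898423: 1, 616318177: 1,
}


def number_and_divisor_sum(factors):
    """Return (N, sum of all positive divisors of N) from a factorization."""
    n, total = 1, 1
    for p, e in factors.items():
        n *= p ** e
        total *= (p ** (e + 1) - 1) // (p - 1)   # 1 + p + ... + p^e
    return n, total


if __name__ == "__main__":
    n, total = number_and_divisor_sum(FRENICLE)
    print((total - n) // n, (total - n) % n)   # quotient and remainder
```

The proper divisors exclude the number itself, which is why the sketch
subtracts n before comparing.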

Another type of number closely related to perfect numbers that
attracted the attention of this group of mathematicians is the amicable
pair. The first pair of amicable numbers, 220 and 284, was known to the
Pythagoreans. These two numbers are amicable because each is the sum of
the proper divisors of the other! This is a property that is
simultaneously remarkable and peculiar: you do have to wonder what in the
world made someone think of it in the first place, but once you notice
it, it does become sort of appealing.

Let’s use these two numbers to practice some of the fundamentals of
divisors and their sums. Since \(220 = 2^2 \cdot 5 \cdot 11\), 220 has
\(3 \cdot 2 \cdot 2 = 12\) divisors (11 of which will be proper divisors)
and we can easily write them all down:

\[ 2^0 \cdot 5^0 \cdot 11^0 = 1, \quad 2^1 \cdot 5^0 \cdot 11^0 = 2, \quad 2^0 \cdot 5^1 \cdot 11^0 = 5, \quad 2^0 \cdot 5^0 \cdot 11^1 = 11, \]

\[ 2^1 \cdot 5^1 \cdot 11^0 = 10, \quad 2^1 \cdot 5^0 \cdot 11^1 = 22, \quad 2^0 \cdot 5^1 \cdot 11^1 = 55, \quad 2^1 \cdot 5^1 \cdot 11^1 = 110, \]

\[ 2^2 \cdot 5^0 \cdot 11^0 = 4, \quad 2^2 \cdot 5^1 \cdot 11^0 = 20, \quad 2^2 \cdot 5^0 \cdot 11^1 = 44, \quad 2^2 \cdot 5^1 \cdot 11^1 = 220. \]

The sum of the proper divisors of 220, then, is just 

1 + 2 + 4 + 5 + 10 + 11 + 20 + 22 + 44 + 55 + 110 = 284. 

It is even easier to check 284 because \(284 = 2^2 \cdot 71\), so it has
only \(3 \cdot 2 = 6\) divisors:

\[ 1, \quad 2, \quad 2^2, \quad 71, \quad 2 \cdot 71, \quad 2^2 \cdot 71. \]

Therefore, the sum of the proper divisors of 284 is 
1 + 2 + 4 + 71 + 142 = 220. 

So, indeed, 220 and 284 do have the peculiar property that each is the 
sum of the proper divisors of the other number; that is, they form an 
amicable pair. 
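The same bookkeeping is easy to hand to a computer. This short Python
sketch lists proper divisors directly from the definition and searches
for amicable pairs among small numbers. The function names are ours, and
the search is deliberately naive.

```python
def proper_divisors(n):
    """All positive divisors of n other than n itself, in increasing order."""
    return [d for d in range(1, n) if n % d == 0]


def amicable_pairs(limit):
    """Pairs (a, b) with a < b, a < limit, each the proper-divisor sum of the other."""
    pairs = []
    for a in range(2, limit):
        b = sum(proper_divisors(a))
        if b > a and sum(proper_divisors(b)) == a:
            pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    print(proper_divisors(220), sum(proper_divisors(220)))
    print(proper_divisors(284), sum(proper_divisors(284)))
    print(amicable_pairs(1000))
```

Requiring b > a keeps perfect numbers, which are their own partners, out
of the list and avoids reporting each pair twice.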

Even though the notion of amicable numbers arose in ancient times,
the next pair of amicable numbers was not found until 1636 when
Fermat wrote to Mersenne that 17 296 and 18 416 were an amicable pair
(see Problem 5.21). Then, in 1638, Descartes, also writing to Mersenne,
not only produced the next amicable pair, 9 363 584 and 9 437 056,
but provided a general formula for producing them in the way Euclid’s
formula produces perfect numbers. His formula is that if each of the
three terms in parentheses is prime, then

\[ 2^{n+1}(18 \cdot 2^{2n} - 1) \quad \text{and} \quad 2^{n+1}(3 \cdot 2^n - 1)(6 \cdot 2^n - 1) \]

are an amicable pair. It seems clear that Fermat was also aware of this
formula, since the value n = 3 produces his amicable pair 17 296 and
18 416. This same formula was also known much earlier to an Arab
mathematician, Thabit ibn Qurra, in the ninth century.
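The formula is easy to try for small n. The Python sketch below
evaluates the three terms \(3 \cdot 2^n - 1\), \(6 \cdot 2^n - 1\), and
\(18 \cdot 2^{2n} - 1\), keeps only those n for which all three are
prime, and returns the resulting pair. Trial division is our choice of
primality test, and the function names are ours.

```python
def is_prime(m):
    """Trial-division primality test, fine for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def sum_proper_divisors(m):
    """Sum of the positive divisors of m other than m itself."""
    total = 1 if m > 1 else 0
    d = 2
    while d * d <= m:
        if m % d == 0:
            total += d
            if d != m // d:
                total += m // d
        d += 1
    return total


def formula_pair(n):
    """The pair given by the formula, or None if some term is not prime."""
    x = 3 * 2 ** n - 1
    y = 6 * 2 ** n - 1
    z = 18 * 2 ** (2 * n) - 1
    if is_prime(x) and is_prime(y) and is_prime(z):
        return (2 ** (n + 1) * x * y, 2 ** (n + 1) * z)
    return None


if __name__ == "__main__":
    for n in range(1, 7):
        pair = formula_pair(n)
        if pair is not None:
            a, b = pair
            print(n, pair, sum_proper_divisors(a) == b, sum_proper_divisors(b) == a)
```

For n from 1 to 6 the formula succeeds exactly three times, and the three
pairs are the ones attributed above to the Pythagoreans, Fermat, and
Descartes.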


Fermat Numbers 

In the last two sections we have been focused on the question: when is
\(2^n - 1\) prime? A very similar question, of course, is: when is
\(2^n + 1\) prime?

For an even number 2n, factoring \(2^{2n} - 1\) involves factoring the two
terms on the right-hand side of

\[ 2^{2n} - 1 = (2^n - 1)(2^n + 1), \]

so it was quite natural for Fermat to become interested in the question
of when \(2^n + 1\) is prime. Moreover, Fermat also realized that in order
for \(2^n + 1\) to be prime, it is necessary for the exponent n to be a
power of 2 (see Problem 5.23).
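A quick experiment makes Fermat's observation visible. This Python
sketch lists the exponents n up to 32 for which \(2^n + 1\) is prime,
using trial division; the function names are ours, and the range is kept
small so that the naive test stays fast.

```python
def smallest_factor(m):
    """Smallest divisor of m greater than 1 (m itself when m is prime)."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m


def prime_exponents(max_n):
    """Exponents n in 1..max_n for which 2^n + 1 is prime."""
    return [n for n in range(1, max_n + 1)
            if smallest_factor(2 ** n + 1) == 2 ** n + 1]


if __name__ == "__main__":
    print(prime_exponents(32))
    # An exponent with an odd factor always gives a composite number:
    print(smallest_factor(2 ** 12 + 1), smallest_factor(2 ** 20 + 1))
```

Every exponent that survives is a power of 2, but the experiment also
shows that being a power of 2 is not sufficient, since n = 32 is missing
from the output.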

This led him to make the following famous conjecture: 

All numbers of the form \(2^{2^n} + 1\) are prime.

In 1640, he wrote to Frenicle with exactly this conjecture, listing the first
seven such “primes,” that is, for n = 0, 1, 2, 3, 4, 5, 6:

3, 5, 17, 257, 65 537, 4 294 967 297, 18 446 744 073 709 551 617,

saying 

I don’t have an exact proof, but I have excluded such a great 
quantity of divisors by infallible demonstrations and I have shed 
so much light on the problem which will establish my idea, that 
it would be difficult for me to retract. 

Fermat made his conjecture based largely on the extremely tempting
evidence that the first five of these numbers,

\[ 2^1 + 1 = 3, \quad 2^2 + 1 = 5, \quad 2^4 + 1 = 17, \quad 2^8 + 1 = 257, \quad 2^{16} + 1 = 65\,537, \]

are prime; and, of course, on the fact that \(2^n + 1\) can be prime only
if n is a power of 2.

However, he didn’t even know whether the next number,
\(2^{32} + 1 = 4\,294\,967\,297\), was prime or not. And, as you can
imagine, he had absolutely no idea whether
\(2^{64} + 1 = 18\,446\,744\,073\,709\,551\,617\) was prime.

Nonetheless, this has become one of Fermat’s most famous conjectures,
and as a result numbers of the form

\[ F_n = 2^{2^n} + 1 \]


are called Fermat numbers. This conjecture is famous for several reasons. 
For one thing, it was a conjecture made by Fermat; for another, it is a 
very appealing conjecture with such a simple and attractive pattern. But 
the main reason this conjecture became so famous is that it turned out 
to be wrong! 

Not only was it wrong, but it was dramatically wrong in ways that
show us that the intuition of even the greatest of mathematicians can
fail rather badly on occasion. The very next number on Fermat’s list,
\(F_5 = 2^{32} + 1\), is not prime, although it would be almost a hundred
years before Euler found a factor of \(F_5 = 4\,294\,967\,297\). In
Problem 5.24, you will discover why Fermat himself should have been able
to factor this number. Why he failed to do so is a mystery.

That, however, is only the beginning of what goes wrong with
Fermat’s conjecture. Not only is \(F_5 = 2^{32} + 1\) not prime, but it
seems likely that the only Fermat numbers that are actually prime are
\(F_0\), \(F_1\), \(F_2\), \(F_3\), and \(F_4\). No others have been found
so far. Thus these first five Fermat numbers (that is, 3, 5, 17, 257,
65 537) are sometimes referred to as Fermat primes, but one needs to take
care with this terminology always to remember that these are the only
known Fermat primes.

In 1880, Fortuné Landry showed that \(F_6\) could be factored as
\(F_6 = 2^{64} + 1 = 274\,177 \cdot 67\,280\,421\,310\,721\). We don’t
know how Landry managed to come up with this factorization. However, it
seems likely that he must have approached this problem using the same
basic idea that Euler used to factor \(F_5\) (see Problem 5.24). Thus
Landry was aware that a prime divisor p of \(F_6\) must have the form
\(p = 128n + 1\). So, in principle, all Landry had to do was try primes of
this form until he found one that was a divisor of \(2^{64} + 1\).

In order to save labor, Landry might well have realized that since
\(128 \cdot 1 + 1 = 129\) is divisible by 3, for every third value of n
after n = 1 the number \(128n + 1\) would also be divisible by 3; hence
none of these numbers needs to be tested as a potential prime divisor of
\(F_6\). Similarly, \(128 \cdot 3 + 1 = 385\) is divisible by 5, 7, and
11; therefore, every 5th, 7th, and 11th value of n after n = 3 can be
ignored. Proceeding in this way (13 is a divisor of
\(128 \cdot 7 + 1 = 897\), 17 is a divisor of
\(128 \cdot 15 + 1 = 1921\), and so on) eliminates much of the tedious
testing of potential divisors. For example, using this idea for just the
first 50 numbers of the form \(p = 128n + 1\), we discover that p is prime
only for n = 2, 5, 6, 9, 11, 21, 26, 27, 35, and 39. Hence we can get by
with just testing 10 numbers instead of 50. Still, the eventual divisor
that was found by Landry didn’t show up until n = 2142, so this was quite
a remarkable achievement in this pre-computer era.
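What cost Landry enormous labor is a short loop today. The Python
sketch below tries the numbers \(128n + 1\) one after another as divisors
of \(F_6\), without bothering to skip the composite candidates. For
\(F_5\) it uses candidates of the form \(64n + 1\); that form is not
stated in the text above and is our assumption, taken from the standard
result behind Euler's factorization. The function name is ours.

```python
def find_factor(number, step, max_n):
    """First n in 1..max_n for which step*n + 1 divides number.

    Returns (n, step*n + 1), or None if no candidate divides number.
    """
    for n in range(1, max_n + 1):
        candidate = step * n + 1
        if number % candidate == 0:
            return n, candidate
    return None


F5 = 2 ** 32 + 1
F6 = 2 ** 64 + 1

if __name__ == "__main__":
    n, p = find_factor(F5, 64, 100)      # assumed form 64n + 1 for F5
    print(n, p, F5 // p)
    n, p = find_factor(F6, 128, 3000)    # form 128n + 1, as in the text
    print(n, p, F6 // p)
```

Composite candidates need not be filtered out for correctness: any
composite divisor would have a smaller prime factor that also divides the
Fermat number and has the same form, so the first divisor found is
automatically prime.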

It was to be almost a century before the next Fermat number
\(F_7\) was factored in 1970, then \(F_8\) (1981), \(F_{11}\) (1988),
\(F_9\) (1990), and \(F_{10}\) (1995). It is now known that all of the
Fermat numbers from \(F_5\) up to \(F_{32}\) are composite although it is
still the case that only \(F_5\) through \(F_{11}\) have been completely
factored. In 1996, on his eightieth birthday, Richard Guy made a $20 bet
with John Conway (at the time they were writing The Book of Numbers) that
another Fermat number would be completely factored within the next
twenty years.

Richard Guy got some encouraging news in early 2010 when Michael
Vang found the sixth known prime factor of \(F_{12}\):

568 630 647 535 356 955 169 033 410 940 867 804 839 360 742 060 818 433.

However, once you divide \(F_{12}\) by its six known prime factors, you
are still left to deal with a composite number having 1133 digits! So John
Conway is not too worried about his $20 bet.


Binomial Coefficients 

Our final topic in this chapter on Fermat is binomial coefficients. We 
introduce this extremely useful subject at this point in order to be able 
to discuss how Fermat was able to use binomial coefficients to prove 
a theorem about which he wrote, “there can hardly be found a more 
beautiful or a more general theorem about numbers.” That seems like 
a theorem that might be worth talking about. But first, we need to talk 
about binomial coefficients. 

Diagrams that contained the binomial coefficients (that is, the
coefficients for binomials such as \((a + b)^2\), \((a + b)^3\), ...)
appeared in China as early as the eleventh century. An elegant diagram
showing these coefficients arranged in a triangle for all these binomials
up through \((a + b)^8\) appeared with the title page of Zhu Shijie’s book
The Precious Mirror of the Four Elements in 1303. Another of China’s great
mathematicians, Yang Hui, had included a similar triangular array in his
own series of works which appeared from 1261 to 1275.

Figure 5.3 Zhu Shijie's arrangement of the binomial coefficients.

It is easy to expand a binomial such as \((a + b)^3\) using Zhu Shijie’s
diagram because the third row of the diagram gives us the coefficients
1, 3, 3, 1 for this binomial, so we know that

\[ (a + b)^3 = 1 \cdot a^3 + 3 \cdot a^2 b + 3 \cdot ab^2 + 1 \cdot b^3. \]

Note that in this diagram we count the rows starting from the top with
0 rather than 1, so that the third row will correspond to a binomial of
degree 3.

We can even read the numbers in the sixth row without much
difficulty, and conclude that

\[ (a + b)^6 = 1 \cdot a^6 + 6 \cdot a^5 b + 15 \cdot a^4 b^2 + 20 \cdot a^3 b^3 + 15 \cdot a^2 b^4 + 6 \cdot ab^5 + 1 \cdot b^6. \]

In the West, this triangular array of numbers is called Pascal's triangle,
even though it also appeared in Europe four hundred years before Blaise
Pascal was born. In China, it is known as the Yang Hui triangle. In Iran
it is called the Khayyam triangle. Yet it is still fitting for this array to be
named after Pascal because he was the first to explore the properties of
these numbers systematically.

In particular, Pascal made good use of the triangle in the study of
probabilities. In 1654, Pascal received a letter from a friend of his, the
Chevalier de Méré, a gambler, asking how two players whose game of
dice has been interrupted should go about dividing the stakes. Since
this amounts to asking at any given stage of a game how to compute
the probabilities that each player will eventually win the game, this
question led Pascal, along with Fermat, to lay the foundations of
probability theory. It was also in 1654 that Pascal first used the method
of induction; and he did so to prove a property of the triangular array of
numbers that now bears his name.

We define the binomial coefficient \(\binom{n}{r}\), which we read as
“n choose r,” to be the number of ways to choose r elements from a set of
n elements. For example, there are six ways to choose two elements
from a set with four elements \(\{a, b, c, d\}\); namely, you can choose
\(\{a, b\}\), \(\{a, c\}\), \(\{a, d\}\), \(\{b, c\}\), \(\{b, d\}\), or
\(\{c, d\}\). Thus \(\binom{4}{2} = 6\).
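This definition can be tried out directly in Python, whose standard
library can enumerate the choices for us. The sketch below lists the
two-element subsets of a four-element set and compares the count with the
library's own binomial coefficient function; the variable name `subsets`
is ours.

```python
from itertools import combinations
from math import comb

# Every way to choose two elements from the set {a, b, c, d}.
subsets = list(combinations("abcd", 2))

if __name__ == "__main__":
    print(subsets)
    print(len(subsets), comb(4, 2))
```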

To expand a binomial such as

\[ (a + b)^4 = (a + b)(a + b)(a + b)(a + b), \]

we must multiply each a and b in each term by all the other as and
bs in all possible ways. So, if we want to know the coefficient of
\(a^4\) in the expansion of \((a + b)^4\), we ask how many ways can we
form \(a^4\), and the answer is that there is only one way to do this: by
choosing an a from each of the four terms in parentheses (which is
equivalent to not choosing any bs). Similarly, if we want to know the
coefficient of \(a^2 b^2\) in the expansion of \((a + b)^4\), we ask how
many ways can we form \(a^2 b^2\), and we answer this by asking how many
ways can we choose two bs. For example, we could choose the first two bs
(and hence the last two as), or we could choose the first b and the third
b (and then, therefore, the second a and the fourth a), and so on. The
answer, then, is that there are \(\binom{4}{2} = 6\) ways to form
\(a^2 b^2\).

Therefore, in this way, we see that the expansion for \((a + b)^4\) is

\[ (a + b)^4 = \binom{4}{0} a^4 + \binom{4}{1} a^3 b + \binom{4}{2} a^2 b^2 + \binom{4}{3} ab^3 + \binom{4}{4} b^4, \]

that is, \((a + b)^4 = a^4 + 4a^3 b + 6a^2 b^2 + 4ab^3 + b^4\).
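The counting argument can be carried out literally: pick an a or a b
from each of the n factors in every possible way, and tally how many
picks contain each number of bs. The following Python sketch does exactly
that. The function name is ours, and the method is only practical for
small n, since it runs through all \(2^n\) picks.

```python
from collections import Counter
from itertools import product


def expansion_coefficients(n):
    """Coefficients of a^n, a^(n-1) b, ..., b^n in (a + b)^n, by direct counting."""
    # Each element of product("ab", repeat=n) is one way of picking a letter
    # from every factor; the number of b's decides which term it lands in.
    tally = Counter(pick.count("b") for pick in product("ab", repeat=n))
    return [tally[r] for r in range(n + 1)]


if __name__ == "__main__":
    print(expansion_coefficients(4))
    print(expansion_coefficients(6))
```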

Note that one binomial coefficient, \(\binom{4}{0}\), does not seem well
defined at this point. In this situation its value is quite clear since
choosing zero bs means you are choosing four as, so
\(\binom{4}{0} = 1\), even though it is not exactly clear what one might
mean in general by saying “there is one way to choose zero elements from
a set of four elements.” In any event, to remove whatever lingering
ambiguity there may be, we simply define \(\binom{n}{0}\) to be 1.

The following result, which now merely states the connection between
the coefficients of a binomial \((a + b)^n\) and binomial coefficients
\(\binom{n}{r}\), is known as the binomial theorem.

Theorem 5.5 (the binomial theorem). 

\[ (a + b)^n = \binom{n}{0} a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + \binom{n}{n} b^n. \]
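
The counting argument behind the binomial theorem can be tried out directly. The following Python sketch (the function name `expansion_coefficients` is our own, not part of any library) picks an $a$ or a $b$ from each of the $n$ factors in every possible way and tallies the results by the number of $b$'s chosen. It is meant only as an illustration for small $n$, since it examines all $2^n$ choices.

```python
from collections import Counter
from itertools import product
from math import comb

def expansion_coefficients(n):
    """Coefficients of (a + b)^n; entry k is the coefficient of a^(n-k) b^k."""
    # Each element of product("ab", repeat=n) is one way of picking
    # an a or a b from each of the n factors.
    counts = Counter(choice.count("b") for choice in product("ab", repeat=n))
    return [counts[k] for k in range(n + 1)]

print(expansion_coefficients(4))            # coefficients of (a + b)^4
print([comb(4, k) for k in range(5)])       # the binomial coefficients
```

Both lines print the same list, as the theorem predicts.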

This theorem wouldn’t do us much good if we didn’t have an effective 
way to compute binomial coefficients. There are two ways to do this. 
The first is to use Pascal’s triangle (Figure 5.4), but this method is circular 
unless we have a way to generate Pascal’s triangle other than to say it is 
just a triangular arrangement of the binomial coefficients. 

As you probably know, the way to generate Pascal’s triangle is to
observe that you have 1s going down the left and right sides of the
triangle, and otherwise each number in the triangle is the sum of the
two numbers immediately above it. So $10 = 4 + 6$, $56 = 21 + 35$, and so
on.
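
This rule translates almost word for word into a short program. The sketch below is one possible way to write it in Python; the function name `pascal_rows` is ours, and the top row (the single 1) is counted as the first row returned.

```python
def pascal_rows(num_rows):
    """Return the first num_rows rows of Pascal's triangle as lists."""
    if num_rows <= 0:
        return []
    rows = [[1]]
    while len(rows) < num_rows:
        prev = rows[-1]
        # Each interior entry is the sum of the two entries above it.
        middle = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        # 1s go down the left and right sides.
        rows.append([1] + middle + [1])
    return rows

for row in pascal_rows(9):
    print(row)
```

Asking for nine rows reproduces the triangle of Figure 5.4, including the entries $10 = 4 + 6$ and $56 = 21 + 35$ mentioned above.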

It is very easy to see why this works. For example, in the expansion
for $(a + b)^7$, we have

\[ (a + b)^7 = a^7 + \cdots + 21 a^5 b^2 + 35 a^4 b^3 + \cdots + b^7, \]

and there are now $56 = 21 + 35$ ways to get an $a^5 b^3$ term in the
expansion for $(a + b)^8$. This is because if we compute $(a + b)(a + b)^7$,
the only choices are to multiply $b$ times $21 a^5 b^2$, or to multiply $a$ times
$35 a^4 b^3$, and so we get $56 a^5 b^3$.

This recursive property of the binomial coefficient — and of Pascal’s 
triangle — is important enough that it is worth recording as a theorem. 

Theorem 5.6.

\[ \binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}. \]

Proof

For this proof, we want to rely only on our definition of the binomial
coefficient $\binom{n}{r}$ as the number of ways to choose $r$ elements from a set of
$n$ elements.

Consider a set $S$ with $n$ elements. Let $x \in S$ be a specific element in $S$.
As we choose $r$ elements from the set $S$, two possible things can happen:
either $x$ will be one of the $r$ elements we choose, or $x$ won’t be one of the
$r$ elements we choose.

If $x$ is one of the $r$ elements, then we have effectively chosen $r - 1$
elements from the other $n - 1$ elements in $S$, and there are $\binom{n-1}{r-1}$ ways to
do this.

If $x$ is not one of the $r$ elements, then we have chosen $r$ elements from
the other $n - 1$ elements in $S$, and there are $\binom{n-1}{r}$ ways this can be done.

Therefore, $\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}$, as desired. This completes the proof. ■

                  1
                1   1
              1   2   1
            1   3   3   1
          1   4   6   4   1
        1   5  10  10   5   1
      1   6  15  20  15   6   1
    1   7  21  35  35  21   7   1
  1   8  28  56  70  56  28   8   1

Figure 5.4 Pascal's triangle.
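
The case split used in the proof of Theorem 5.6 can be watched in action on a small set. In the Python sketch below, the set is taken to be $\{0, 1, \ldots, n-1\}$ and the distinguished element $x$ is 0; these choices, and the function name `split_by_element`, are assumptions made purely for illustration.

```python
from itertools import combinations
from math import comb

def split_by_element(n, r, x=0):
    """Count the r-element subsets of {0, ..., n-1} that contain x, and those that do not."""
    subsets = list(combinations(range(n), r))
    with_x = sum(1 for s in subsets if x in s)
    without_x = sum(1 for s in subsets if x not in s)
    return with_x, without_x

with_x, without_x = split_by_element(8, 3)
print(with_x, comb(7, 2))        # subsets containing x
print(without_x, comb(7, 3))     # subsets not containing x
print(with_x + without_x, comb(8, 3))
```

For $n = 8$ and $r = 3$ the two cases contribute 21 and 35 subsets, which is the relation $56 = 21 + 35$ seen earlier in Pascal's triangle.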

The second method for computing binomial coefficients is to use
an explicit formula for $\binom{n}{r}$. This formula uses the factorial notation
mentioned in Problem 2.16. The expression $k!$ is read “$k$ factorial” and
represents $k \cdot (k - 1) \cdot (k - 2) \cdots 2 \cdot 1$. Thus, $4! = 4 \cdot 3 \cdot 2 \cdot 1 = 24$.
Let us use a specific example to see the main idea here. Suppose we
have a set with seven letters, $\{a, b, c, d, e, f, g\}$. How many different
ways can we permute three letters from this set? That is, how many
different permutations can we form such as $abc$, $acf$, $ceg$, $bca$, . . . ,
where order makes a difference? (Think of this as a horse race; we are
asking how many different ways these seven horses can finish: first,
second, and third.)

The answer is just $7 \cdot 6 \cdot 5$, because there are seven choices for the first
letter, then six choices left for the second letter, and finally five choices
for the third letter. Note that we can also write this answer as $7!/4!$, since

\[ \frac{7!}{4!} = \frac{7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{4 \cdot 3 \cdot 2 \cdot 1} = 7 \cdot 6 \cdot 5. \]

Now, using this idea, we can compute $\binom{7}{3}$, because when we permuted
three letters, each three-letter combination such as $\{a, b, c\}$
resulted in six different permutations: $abc$, $acb$, $bac$, $bca$, $cab$, $cba$; and
each permutation is merely a different way of “choosing” these three
letters.

Therefore,

\[ \binom{7}{3} = \frac{7 \cdot 6 \cdot 5}{6} = \frac{7 \cdot 6 \cdot 5}{3!} = \frac{7!}{3!\,4!} = \frac{7!}{3!\,(7-3)!}. \]

This formula for binomial coefficients is important enough that it 
too is certainly worth recording as a theorem. 

Theorem 5.7.

\[ \binom{n}{r} = \frac{n!}{r!\,(n-r)!}. \]


Proof 

To choose $r$ elements from a set of $n$ elements, we first choose them
in order. There are $n$ ways to choose the first element, $n - 1$ ways to
choose the second element, and so on. So this can be done in a total of
$n(n-1)(n-2) \cdots (n-r+1)$ ways. Note that

\[ n(n-1)(n-2) \cdots (n-r+1) = \frac{n!}{(n-r)!}. \]

But each set of $r$ elements has now been counted exactly $r!$ times,
because $r$ elements can be arranged in $r! = r(r-1)(r-2) \cdots 2 \cdot 1$
different orders. Therefore,

\[ \binom{n}{r} = \frac{n!}{r!\,(n-r)!}. \]


This completes the proof. ■ 
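
The horse-race example can be checked by brute force. The sketch below, in Python, lists every ordered finish of three out of seven letters and then forgets the order; the variable names are ours and the approach is only a small-scale illustration of the counting in the proof, not an efficient way to compute binomial coefficients.

```python
from itertools import permutations
from math import comb, factorial

letters = "abcdefg"

# Ordered selections of three letters: first, second, third.
ordered = list(permutations(letters, 3))

# Forgetting the order collapses each group of 3! permutations into one set.
unordered = {frozenset(p) for p in ordered}

print(len(ordered), 7 * 6 * 5)
print(len(unordered), len(ordered) // factorial(3))
print(factorial(7) // (factorial(3) * factorial(4)), comb(7, 3))
```

The three printed lines show 210 ordered selections, 35 unordered ones, and agreement with the formula of Theorem 5.7.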

Fermat made use of an interesting (and, once you see it, obvious)
property of binomial coefficients to prove his little theorem for the case
$a = 2$. The obvious pattern in Pascal's triangle that he used is that the
sum of the numbers in any row always equals a power of 2. Starting with
$1 = 2^0$, the first row is $1 + 1 = 2^1$, the second row is $1 + 2 + 1 = 2^2$, the
third row is $1 + 3 + 3 + 1 = 2^3$, and so on; the sum of the numbers in the
$n$th row is always $2^n$.

There is a good reason why this obvious property holds true. The
numbers in the $n$th row, $\binom{n}{0}, \binom{n}{1}, \binom{n}{2}, \ldots, \binom{n}{n}$, represent the number
of ways of choosing 0 elements, the number of ways of choosing
1 element, the number of ways of choosing 2 elements, the
number of ways of choosing 3 elements, and so on, until you get to the
number of ways of choosing $n$ elements, all from a set with $n$ elements;
in other words, the sum of these numbers represents the total number
of ways of choosing subsets from a set with $n$ elements. But a set with $n$
elements has a total of $2^n$ subsets, since each element is either in or not
in a given subset (think of flipping a coin for each element to decide
whether it is to be in a given subset: for heads it is included in the subset,
for tails it is not included in the subset; and so there are $2^n$ possible
outcomes). Hence

\[ \binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n} = 2^n. \]



By 1636, Fermat had proved another property of Pascal's triangle: for
any prime $p$, the number $p$ divides every number in the $p$th row in
Pascal’s triangle, except of course the two 1s on either end. You
will be asked to prove this property in Problem 5.40. So, for example,
in the seventh row, we see that 7 divides each of the numbers
7, 21, 35, 35, 21, 7.
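
Both row patterns just described (row sums are powers of 2, and a prime $p$ divides the interior entries of row $p$) are easy to test numerically. The following Python sketch does so for the first few rows; the helper names `pascal_row` and `interior_divisible_by` are our own. A numerical check of this kind is of course not a proof.

```python
from math import comb

def pascal_row(n):
    """Row n of Pascal's triangle, with the single 1 at the top counted as row 0."""
    return [comb(n, r) for r in range(n + 1)]

def interior_divisible_by(n):
    """True if n divides every entry of row n other than the two 1s (n >= 2)."""
    return all(entry % n == 0 for entry in pascal_row(n)[1:-1])

# Row sums compared with powers of 2.
for n in range(9):
    print(n, sum(pascal_row(n)), 2 ** n)

# Which rows have all interior entries divisible by the row number?
for n in range(2, 12):
    print(n, interior_divisible_by(n))
```

For $n$ from 2 to 11 the divisibility check succeeds exactly for the primes 2, 3, 5, 7, and 11; for instance, row 4 fails because 4 does not divide 6.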

In about June 1640, Fermat wrote to Mersenne saying, “Here are three
propositions I have found on which I hope to erect a great building.”
The second of these propositions was the statement that when $p$ is an
odd prime, then $p$ must divide $2^{p-1} - 1$. Thus this second proposition
was nothing less than Fermat's little theorem for the case $a = 2$.

Here is how he proved that particular case of his little theorem in his
letter to Mersenne. First write out $2^p$ as follows:

\[ 2^p = 1 + \binom{p}{1} + \binom{p}{2} + \cdots + \binom{p}{p-1} + 1. \]


Now, since $p$ divides each of the binomial coefficients in this expression,
it must also divide $2^p - 2 = 2(2^{p-1} - 1)$; hence, since $p$ is odd,
$p \mid (2^{p-1} - 1)$. In Problem 5.40, you
will see how this argument can be used to prove Fermat’s little theorem
in general.
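
Fermat's argument for the case $a = 2$ can be followed with actual numbers. In the Python sketch below (the function name `middle_sum` is ours), the binomial coefficients strictly between the two 1s in row $p$ are added up; the total is $2^p - 2$, and for each odd prime tried it is divisible by $p$.

```python
from math import comb

def middle_sum(p):
    """Sum of the binomial coefficients strictly between the two 1s in row p."""
    return sum(comb(p, r) for r in range(1, p))

for p in (3, 5, 7, 11, 13):
    s = middle_sum(p)
    # Columns: p, 2^p - 2, its remainder mod p, and 2^(p-1) mod p.
    print(p, s, s % p, pow(2, p - 1, p))
```

The built-in three-argument `pow` computes $2^{p-1}$ modulo $p$ directly, which gives an independent check that the remainder is 1.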

Fermat also used binomial coefficients to prove an amazing theorem
for finding formulas for sums of the form

\[ 1^r + 2^r + 3^r + \cdots + n^r. \]


This is the theorem about which he wrote, “there can hardly be found a 
more beautiful or a more general theorem about numbers.” 

For example, we are already familiar with the first two of these
formulas (see Problems 2.1 and 1.12), for the values $r = 1, 2$:

\[ 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}; \qquad 1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}. \]

Fermat’s idea was to recursively make use of one of these formulas to
find the next formula. So he used the above formula for $1^2 + 2^2 + 3^2 + \cdots + n^2$
to find a formula for $1^3 + 2^3 + 3^3 + \cdots + n^3$. Then he used the
formula for $1^3 + 2^3 + 3^3 + \cdots + n^3$ to find a formula for $1^4 + 2^4 + 3^4 + \cdots + n^4$.
In this way, he could find a formula for $1^r + 2^r + 3^r + \cdots + n^r$, for any
value of $r$.

In his proof, Fermat used still another property of binomial coeffi- 
cients, a property of Pascal’s triangle that, at least in Minnesota, is called 
the “hockey stick theorem” (see Problem 5.41). This is due to the fact 
that any number in Pascal’s triangle is always the sum of all the numbers 
extending along a diagonal line above the number in a pattern that 
looks remarkably like a hockey stick! For example, 

56 = 1 + 5 + 15 + 35, 


as shown in bold in Figure 5.5.

We can write the hockey stick theorem formally as follows:

\[ \binom{n+r}{r+1} = \sum_{i=1}^{n} \binom{i+r-1}{r}. \]
                  1
                1   1
              1   2   1
            1   3   3   1
          1   4   6   4   1
        1   5  10  10   5   1
      1   6  15  20  15   6   1
    1   7  21  35  35  21   7   1
  1   8  28  56  70  56  28   8   1

Figure 5.5 The hockey stick theorem: 56 = 35 + 15 + 5 + 1.


Thus, for example, if we let $n = r = 4$, this says $\binom{8}{5} = \sum_{i=1}^{4} \binom{i+3}{4}$, which
is just $\binom{8}{5} = \binom{4}{4} + \binom{5}{4} + \binom{6}{4} + \binom{7}{4}$, that is, $56 = 1 + 5 + 15 + 35$.
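
The hockey stick pattern is easy to test for many values of $n$ and $r$ at once. Here is a short Python sketch; the function name `hockey_stick_sum` is ours, and the summation is written exactly as in the formal statement, with $i$ running from 1 to $n$.

```python
from math import comb

def hockey_stick_sum(n, r):
    """Sum of C(i + r - 1, r) for i = 1, ..., n (the 'handle' of the stick)."""
    return sum(comb(i + r - 1, r) for i in range(1, n + 1))

# The example from the text: n = r = 4.
print(hockey_stick_sum(4, 4), comb(8, 5))

# A broader numerical check of the identity.
print(all(hockey_stick_sum(n, r) == comb(n + r, r + 1)
          for n in range(1, 10) for r in range(1, 10)))
```

The first line prints 56 twice. The second check is numerical evidence only; the identity itself is what Problem 5.41 asks you to prove.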

Fermat then wrote the term inside this summation as a polynomial
in terms of the variable $i$ as follows:

\[ \binom{i+r-1}{r} = \frac{(i+r-1) \cdots (i+1)\,i}{r!} = \frac{i^r + a_1 i^{r-1} + \cdots + a_{r-1} i}{r!}, \]

where $a_1, \ldots, a_{r-1}$ are just the coefficients of this polynomial.
Fermat can sum both sides of this equation from 1 to $n$ using the
“hockey stick” property of Pascal’s triangle to get

\[ \binom{n+r}{r+1} = \sum_{i=1}^{n} \frac{i^r}{r!} + a_1 \sum_{i=1}^{n} \frac{i^{r-1}}{r!} + \cdots + a_{r-1} \sum_{i=1}^{n} \frac{i}{r!}. \]

But

\[ \binom{n+r}{r+1} = \frac{(n+r) \cdots n}{(r+1)!}, \]

so we can multiply the previous equation through by $r!$ to get

\[ \frac{(n+r) \cdots n}{r+1} = \sum_{i=1}^{n} i^r + a_1 \sum_{i=1}^{n} i^{r-1} + \cdots + a_{r-1} \sum_{i=1}^{n} i. \]

And there, finally, sitting on the right side of this equation are the very 
sums that Fermat wanted formulas for: his “beautiful theorem.” 

Let’s see how to use this amazing equation to find these formulas, one
after the other. The first formula couldn’t be easier. Let $r = 1$; then the
equation above becomes

\[ \frac{(n+1)n}{2} = \sum_{i=1}^{n} i, \]

which we immediately recognize as the formula for the triangular
numbers (see Problem 2.1).

This was the first step in what is now a recursive process for finding
all of the other formulas. Next, let $r = 2$. For this case, we will need
to know the coefficient $a_1$ from the polynomial $(i + 1)i$. So we write
$(i + 1)i = i^2 + a_1 i$, and see that, in this case, $a_1 = 1$.

Therefore, with $r = 2$ the equation above is

\[ \frac{(n+2)(n+1)n}{3} = \sum_{i=1}^{n} i^2 + a_1 \sum_{i=1}^{n} i, \]

and, because $\sum_{i=1}^{n} i = \frac{(n+1)n}{2}$ and $a_1 = 1$, we can rewrite this as

\[ \sum_{i=1}^{n} i^2 = \frac{(n+2)(n+1)n}{3} - \frac{(n+1)n}{2}, \]

which, with a little algebra, reduces to the familiar formula for the sum
of the first $n$ squares

\[ \sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}, \]

which you were asked to prove by induction in Problem 1.12.

One of the main advantages of Fermat’s recursive method is that it
actually produces these formulas, whereas induction can be used to
prove a formula only if you already have a formula in hand to prove.

We could now continue in this same way, finding next a formula for
$\sum_{i=1}^{n} i^3$ by using the formulas we know for $\sum_{i=1}^{n} i^2$ and $\sum_{i=1}^{n} i$; then
finding a formula for $\sum_{i=1}^{n} i^4$, using the formulas for $\sum_{i=1}^{n} i^3$, $\sum_{i=1}^{n} i^2$,
and $\sum_{i=1}^{n} i$; and so on, as long as we like (see Problems 5.44 and 5.46).
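
The recursive process described above is mechanical enough to hand to a computer. The Python sketch below is one possible rendering of it, not a reconstruction of Fermat's own calculations: polynomials in $n$ are stored as lists of exact fractions (lowest power first), and the helper names `poly_mul`, `rising_product`, `power_sum_polys`, and `evaluate` are our own. For each $r$ it starts from the left side $n(n+1)\cdots(n+r)/(r+1)$ and subtracts $a_j$ times the previously found formulas for the lower powers.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def rising_product(count):
    """Coefficients of x(x + 1)...(x + count - 1), lowest power first."""
    poly = [Fraction(1)]
    for k in range(count):
        poly = poly_mul(poly, [Fraction(k), Fraction(1)])  # multiply by (x + k)
    return poly

def power_sum_polys(max_r):
    """Return {r: polynomial in n for 1^r + 2^r + ... + n^r}, r = 1..max_r."""
    sums = {}
    for r in range(1, max_r + 1):
        # Left side of the equation: n(n + 1)...(n + r) / (r + 1).
        total = [c / (r + 1) for c in rising_product(r + 1)]
        # i(i + 1)...(i + r - 1) = i^r + a_1 i^(r-1) + ... + a_(r-1) i,
        # so a[power] is the coefficient multiplying the sum of i^power.
        a = rising_product(r)
        for power in range(1, r):
            for k, c in enumerate(sums[power]):
                total[k] -= a[power] * c
        sums[r] = total
    return sums

def evaluate(poly, n):
    """Evaluate a coefficient list at the integer n."""
    return sum(c * n ** k for k, c in enumerate(poly))

polys = power_sum_polys(4)
print(polys[2])                                   # coefficients for the sum of squares
print(evaluate(polys[3], 10), sum(i ** 3 for i in range(1, 11)))
```

For $r = 2$ the coefficients come out as $0, \tfrac{1}{6}, \tfrac{1}{2}, \tfrac{1}{3}$, that is, $\tfrac{n}{6} + \tfrac{n^2}{2} + \tfrac{n^3}{3} = \tfrac{n(n+1)(2n+1)}{6}$, in agreement with the formula derived above.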


"Multi Pertransibunt et Augebitur Scientia" 

Fermat must have sensed that his work in number theory might not 
be carried on by others, and in fact this indeed turned out to be the 
case. It would not be until another great mathematician, Euler, picked 
up Fermat's enthusiasm for numbers, nearly a century later, that this 
particular branch of mathematics would again flourish. 

In 1659, and less than six years before his death, Fermat wrote a 
melancholy last letter to Huygens quoting the philosopher and states- 
man Francis Bacon in the process of passing the torch forward to the 
future: 

Perhaps posterity will be grateful to me for having shown that the 
ancients did not know everything, and this account may come to 
be regarded by my successors as the passing of the torch, in the 
words of the great chancellor of England, following whose 
intention and motto I shall add, 

many shall pass away, knowledge will grow. 


Problems 

5.1 * Verify the fundamental identity of Fibonacci from Liber quadratorum: 

\[ (a^2 + b^2)(c^2 + d^2) = (ac \pm bd)^2 + (ad \mp bc)^2. \]

5.2 * Use the example of Diophantus in which 65 is written as a sum of two

squares in two ways (that is, $65 = 8^2 + 1^2$ and $65 = 4^2 + 7^2$) and the
fundamental identity of Fibonacci to write 325 as a sum of two squares
in three different ways.

5.3 (S) When $m = a^2 + b^2$ and $n = c^2 + d^2$ are two integers, each of which is a
sum of two squares, we can use the fundamental identity of Fibonacci
on page 118 to write their product $mn$ as a sum of two squares.

Show that we can accomplish the same thing another way using the 
property of complex numbers that the absolute value of a product of 
two complex numbers is equal to the product of their absolute values, 
that is, 


\[ |(a + bi)(c + di)| = |a + bi| \cdot |c + di|, \]

where the absolute value of a complex number $a + bi$ is defined to be
$|a + bi| = \sqrt{a^2 + b^2}$. The absolute value of a complex number
represents the distance in the plane between a point ( a , b) and the 
origin. Note that when b = 0 the point (a, b) is on the x-axis and the 
absolute value of the complex number corresponds to the absolute 
value of the real number a. The absolute value of a complex number is 
also sometimes called the modulus, or even the length. 

Then, use this method with complex numbers to write 65 as a sum of
two squares knowing that $65 = 5 \cdot 13$ and that $5 = 1^2 + 2^2$ and
$13 = 2^2 + 3^2$.

5.4 (H,S) Solve the following problem from Fibonacci's Liber quadratorum: 

find a number such that if 5 is either added to or subtracted from 
the square of the number, then the result in each case will be a 
square. 

5.5 (H,S) Show that there is exactly one right triangle whose legs are integers 

and whose hypotenuse is 89. Find the other two sides of this triangle. 

5.6 (S) In Problem 1.4, you answered the question: which integers can be the 

leg of a primitive Pythagorean triangle? In this problem we describe 
which integers can be the hypotenuse of a primitive Pythagorean 
triangle. 

(a) Use Theorem 1.1 and Theorem 5.3 to show that if the integer $n$ is
the hypotenuse of a primitive Pythagorean triangle, then $n$ is a
product of primes of the form $4k + 1$.

(b) Illustrate that the converse of part (a) is true (that is, if an
integer $n$ is a product of primes of the form $4k + 1$, then $n$ is the
hypotenuse of a primitive Pythagorean triangle) by using
Theorem 1.1 and the fundamental identity of Fibonacci on page
118 to find a primitive Pythagorean triangle whose hypotenuse is
$n = 1625 = 5^3 \cdot 13$.

Note that each time you use the fundamental identity of Fibonacci,
one of the two squares produced will be even and one will be odd, as
required in order to apply Theorem 1.1, and that for at least one choice
of $\pm$ and $\mp$ the two squares will be relatively prime, which is also
required. It is not difficult to verify that this happens in general.

5.7 * Consider the geometric progression 

\[ 2, \quad 2^2 = 4, \quad 2^3 = 8, \quad 2^4 = 16, \quad 2^5 = 32, \quad 2^6 = 64, \quad \ldots \]

and the prime number 19. Compute the remainder for each of the
eighteen numbers $2, 2^2, 2^3, \ldots, 2^{18}$ when divided by 19. Which of the
possible remainders from 0 to 18 do you get?

5.8 (S) Use Fermat’s example from his letter to Frenicle, the geometric 

progression 

\[ 1, \quad 3, \quad 3^2 = 9, \quad 3^3 = 27, \quad 3^4 = 81, \quad 3^5 = 243, \quad 3^6 = 729 \]

with the prime number 13, to illustrate the partitioning of the twelve
remainders 1, 2, 3, ... , 12 by taking the first three terms $1, 3, 3^2$ of
the progression as the partitioning set $\{1, 3, 9\}$.

5.9 * (H, S) Is there a good way to test a number to determine whether it is 

prime? Around twenty-five hundred years ago Chinese 
mathematicians thought they had an answer: n is prime if and only if 
$n \mid (2^n - 2)$. Fermat’s little theorem shows that they were half right: if $n$ is
prime, then $n \mid (2^n - 2)$. But in the other direction this conjecture is false.

Nonetheless, this “Chinese hypothesis” is in fact quite impressive; 
for example, for n from 2 to 100 it will correctly pick out all the primes 
and all the composite numbers. Pick three composite numbers, and 
three prime numbers at random less than 30 and, using a calculator, 
confirm that this test correctly determines which are composite and 
which are prime. 

The smallest number for which this test fails is 341. Prove the test
fails for 341 by showing that $341 \mid (2^{341} - 2)$ even though 341 is
composite. This was not discovered until 1819.

5.10 (S) In the example done on pages 126-27, we illustrated the descent 

process by showing how a sum of two squares $34^2 + 1^2$ can be replaced
by a smaller sum of two squares $8^2 + 1^2$. The remainder 8 was the result
of using the division algorithm: $34 = 2 \cdot 13 + 8$. The key step, then, in
that example was replacing 34 by 8 in $34^2 + 1^2$ to get $8^2 + 1^2$, and 13
still divided $8^2 + 1^2$.

However, in the proof of Theorem 5.1, this descent step was
achieved by finding a smaller sum of two squares to be $r^2 + s^2$, where $r$
and $s$ had the additional property that $r, s < k/2$. This extra fact about
the size of these remainders was very important later in the proof.

Note that in the example done on pages 126-27, a = 8 does not have
this extra property. It is true that $13 \mid 8^2 + 1^2 = 65$, but $8 \not< 13/2$. In this
problem we want to see how the example can be done following the
proof of Theorem 5.1 faithfully. This is a simple matter of being careful,
when dividing 34 by 13, to choose a remainder that is smaller than $13/2$.
So, instead, this time we write $34 = 3 \cdot 13 - 5$, and use $-5$ as a
remainder. Note that $13 \mid (-5)^2 + 1^2 = 26$. We could also have come up
with $-5$ just by thinking of $8 \equiv -5 \pmod{13}$.

Continue now the process of writing 89 as a sum of two squares,
replacing 34, not by 8, but by $-5$. (You may, if you prefer, replace 34 by
5, since $(-5)^2 + 1^2 = 5^2 + 1^2$.)

5.11 (S) The proof of Theorem 5.1 follows as closely as possible a reconstruction 

of Fermat’s own thinking, and in particular the proof of Lemma 1
included a contradiction at the point where for every value of $x$ in the
interval $1 \le x \le 4n = p - 1$, the congruence $x^{2n} \equiv 1 \pmod{p}$ was
satisfied. Support this contradiction (as it was likely Fermat himself
did in numerous cases) with the example $p = 13$, and find exactly for
which values of $x$ in the interval $1 \le x \le 4n = p - 1$ the congruence
$x^{2n} \equiv 1 \pmod{13}$ is satisfied.

5.12 In the proof of Theorem 5.1, at the point where we wish to divide both 
sides of 


\[ k^2 mp = (ar + bs)^2 + (as - br)^2 \]

by $k^2$, we need to know that $k \mid (ar + bs)$ and $k \mid (as - br)$. Show this
follows from the fact that $k \mid (a^2 + b^2)$.

5.13 (H, S) Theorem 5.1, Fermat's fundamental theorem on right triangles, says 

each prime of the form 4n + 1 has a unique representation as a sum of 
two squares. Illustrate this by giving a general argument that 
$89 = 5^2 + 8^2$ is the only way to write 89 as a sum of two squares.

5.14 * (H, S) Prove Fermat’s result: for $n > 1$, the number $a^n - 1$ can be prime

only if a = 2, and if n is prime. 
5.15 (H) We are so used to thinking of numbers in base 10 that it is easy to

miss arguments that use other bases.

(a) Use the binary number system to prove that $2^n - 1$ is not prime if
$n$ is not prime.

(b) In 1494, Luca Pacioli claimed that $1 + 2 + 2^2 + 2^3 + \cdots + 2^{26}$ is
prime. Show this is false by using binary numbers to find a
divisor of this number.

5.16 * Since all of the numbers $2^2 - 1$, $2^3 - 1$, $2^5 - 1$, $2^7 - 1$ are prime, it

would be natural to conjecture that $2^p - 1$ is prime whenever $p$ is
prime. (This is the converse of Fermat’s result in Problem 5.14.)

Show that $2^{11} - 1$ is composite. This particular counterexample to
this conjecture wasn’t discovered until the sixteenth century. Thus
Mersenne was correct to leave 11 off of his famous list of primes $p$ for
which $2^p - 1$ is prime.

The next prime missing from Mersenne’s list is 23. Show that he was 
also correct about 23. 

5.17 (H,S) Prove there are infinitely many primes of the form p = 4n + 1 by 

showing that if $\{3, 5, 7, \ldots, p_k\}$ is the set of odd primes up to a prime
$p_k$, then the number

\[ N = 2^2 + (3 \cdot 5 \cdot 7 \cdots p_k)^2 \]

must have a prime divisor of the form $p = 4n + 1$ that is greater than $p_k$.

5.18 ★ (H) In the proof of Theorem 5.4, we made use of the fact that 

\[ 1 + 2 + 2^2 + \cdots + 2^{n-1} = 2^n - 1. \]

Prove that the sum of any finite geometric series, where $r \ne 1$, is
given by

\[ 1 + r + r^2 + \cdots + r^n = \frac{r^{n+1} - 1}{r - 1}. \]

5.19 (S) In 1557, the British physician and mathematician Robert Recorde 
discovered that the sum of the proper divisors of 120 is twice the 
number itself. In 1631, Mersenne asked Descartes to find another such 
number, but it was Fermat who found the next number with this same 
property, 672. Verify that both these numbers have this property. 
5.20 (H,S) Prove that the sum of the proper divisors of the number

\[ 2^{36} \cdot 3^8 \cdot 5^5 \cdot 7^7 \cdot 11 \cdot 13^2 \cdot 19 \cdot 31^2 \cdot 43 \cdot 61 \cdot 83 \cdot 223 \cdot 331 \cdot 379 \cdot 601 \cdot 757 \cdot 1201 \cdot 7019 \cdot 112\,303 \cdot 898\,423 \cdot 616\,318\,177, \]

presented here as a product of its prime factors, is five times the number 
itself. 

5.21 * It can be tedious to check that a pair of numbers is an amicable pair. 

Choose one of the following pairs, and confirm directly that the pair is 
indeed an amicable pair. 

(a) Fermat’s pair: $17\,296 = 2^4 \cdot 23 \cdot 47$ and $18\,416 = 2^4 \cdot 1151$;

(b) 1184 and 1210, a pair discovered in 1866 by a sixteen-year-old 
boy! 

5.22 (S) Imitate the proof of Theorem 5.4 to prove Thabit’s formula for 

producing amicable pairs. This is the formula on page 137 that was 
rediscovered by Descartes in 1638. 

5.23 (H,S) If $a \ne 1$, and $a$ and $k$ are positive integers such that $a^k + 1$ is prime,

then a must be even (why?). Fermat showed also that k must be a power 
of 2. Prove this. 

5.24 Here is why it is so surprising that Fermat never factored the Fermat 
number $F_5 = 2^{32} + 1$ in order to settle his own conjecture. Since

$2^{64} - 1 = (2^{32} - 1)(2^{32} + 1)$, it follows that if a prime $p$ divides $2^{32} + 1$,
then it also divides $2^{64} - 1$. We can also see that if an odd prime $p$
divides $2^{32} + 1$, then $p$ does not divide $2^{32} - 1$. This means, by
Corollary 1 of Fermat’s little theorem, that $64 \mid (p - 1)$; in other words,
$p$ is a prime of the form $64n + 1$. So all Fermat had to do was try primes
of this form until he found one that was a factor of $2^{32} + 1$.

Do what Fermat should have done in the first place, and find a prime
factor of $2^{32} + 1 = 4\,294\,967\,297$. You will be following in Euler’s
footsteps; this is the way he found a factor a hundred years after
Fermat, and you will find exactly the same factor Euler found. We’ll
never know why Fermat failed to do this; it seems to be a complete
mystery since this is the way he factored $2^{37} - 1$; perhaps he did try this
approach, and simply made an error in his computation when he got
to the prime factor. Who knows?

You are strongly urged to do the calculations by hand in this
problem, partly to get an appreciation for the fact that in Fermat’s day
a number such as 4 294 967 297 was a pretty big number to deal with,
but mostly because it’s the best way to experience the kind of
excitement that Leonhard Euler surely must have felt when he
succeeded in factoring this number for the very first time.

5.25 (H) It turns out that $F_5 = 2^{32} + 1$ has exactly two prime factors, that is,
$2^{32} + 1 = pq$, where $p$ and $q$ are primes. Therefore, by finding a prime
divisor of $2^{32} + 1$, Euler had succeeded in completely factoring this
number (and you accomplished the same thing in Problem 5.24).
Without using a calculator or computer, find the complete
factorization of $2^{64} - 1$.


5.26 (S) You may have noticed that for n = 2, 3, 4, 5, and 6, the Fermat 
numbers all end in the digit 7: 

17, 257, 65 537, 4 294 967 297, 18 446 744 073 709 551 617. 


Use induction to prove that this is true for all $n \ge 2$.


5.27 The Fermat numbers are pairwise relatively prime; that is, the greatest 
common divisor of any two Fermat numbers is 1. We will illustrate this 
by proving that $\gcd(2^4 + 1, 2^{16} + 1) = 1$.

Note that if we can show that $2^4 + 1 \mid (2^{16} + 1) - 2$, we’ll be done,
since then any common divisor of $2^4 + 1$ and $2^{16} + 1$ will also divide
$(2^{16} + 1) - 2$, and hence divide 2. But both $2^4 + 1$ and $2^{16} + 1$ are odd,
so the only common divisor is 1.

So, now we write

\[ \frac{(2^{16} + 1) - 2}{2^4 + 1} = \frac{2^{16} - 1}{2^4 + 1} = \frac{(2^4 + 1)(2^{12} - 2^8 + 2^4 - 1)}{2^4 + 1} = 2^{12} - 2^8 + 2^4 - 1, \]

which is an integer. Therefore, $2^4 + 1 \mid (2^{16} + 1) - 2$.

Prove this fact about Fermat numbers in general; that is, prove that if 
$m$ and $n$ are two integers such that $0 \le m < n$, then

\[ \gcd(2^{2^m} + 1, 2^{2^n} + 1) = 1. \]


5.28 Use the fact that the Fermat numbers are pairwise relatively prime to 
give another proof of Euclid's famous theorem that there are infinitely 
many primes (Theorem 2.1). 


5.29 (H, S) The sequence of Fermat numbers 3, 5, 17, 257 . . . has a very nice 
pattern in that the product of the first two numbers, 3 and 5, is 2 less 
than the third number, and the product of the first three numbers,
$3 \cdot 5 \cdot 17 = 255$, is 2 less than the fourth number, and so on. In other
words,

\[ F_0 F_1 F_2 \cdots F_n = F_{n+1} - 2, \]

for all $n \ge 0$.

(a) Use induction to prove this formula.

(b) Give another proof of this formula by simplifying the expression

\[ F_0 F_1 F_2 \cdots F_n = 1 \cdot F_0 F_1 F_2 \cdots F_n = (2^{2^0} - 1)(2^{2^0} + 1)(2^{2^1} + 1)(2^{2^2} + 1) \cdots (2^{2^n} + 1) \]

using the algebraic identity $(a - 1)(a + 1) = a^2 - 1$.

(c) Use this formula to give an alternate proof of the result in 
Problem 5.27 that the Fermat numbers are pairwise relatively 
prime. 

5.30 (H) Fermat wrote to Frenicle in 1640 claiming that his formula $2^{2^n} + 1$

produces an infinite list of primes. But, so far, his list has produced only
five primes. Nonetheless, the goal still remains: find a formula that
produces prime numbers.

(a) Euler noticed that the formula

\[ n^2 - n + 41 \]

produces prime numbers for each value of $n = 1, 2, 3, \ldots, 40$
(obviously, when $n = 41$ it produces a number that is divisible
by 41).

Show that the polynomials $n^2 - n + 11$ and $n^2 - n + 17$ share
with Euler's polynomial $n^2 - n + 41$ the property of being of the
form $n^2 - n + k$ and producing primes for each value of
$n = 1, 2, 3, \ldots, k - 1$.

(b) If we replace $n$ by $n + 1$ in the formula $n^2 - n + 41$ we get a
formula $n^2 + n + 41$ that produces prime numbers for each value
of $n = 0, 1, 2, 3, \ldots, 39$. Explain why the formula

\[ n^2 - 81n + 1681 \]

produces prime numbers for each value of $n = 1, 2, 3, \ldots, 80$.

(c) Here is another list of numbers that, like the Fermat numbers, 
start out looking tantalizingly as if they will produce nothing 
but prime numbers:

31, 331, 3331, 33 331, 333 331, 3 333 331, 33 333 331 


Find the first number in this list that is not prime. 

5.31 (H) Somewhat surprisingly, when we ask whether an integer $N$ is a sum of

two squares, it makes no difference whether we mean “integer squares”
or “rational squares.” In other words, if an integer N is a sum of two 
rational squares, then it is also a sum of two integer squares. Fermat 
knew this but never gave any proof. Prove this fact using Corollary 3 to 
Theorem 5.3. 

5.32 (H,S) It may have occurred to you that while Fermat and others spent a lot 

of time worrying about which numbers were a sum of two squares, we 
have not mentioned the obvious companion question: which positive 
integers can be written as a difference of two squares? 

Find, and prove, the answer to this question. 

5.33 * (S) Fermat developed an effective method for factoring large numbers 

based on the difference of squares. His idea was to try to write a large
number $N$ as $N = x^2 - y^2$, a difference of two squares, from which it
follows that $N$ factors as $(x - y)(x + y)$.

Since he is therefore trying to solve $y^2 = x^2 - N$, he takes the largest
integer $n$ less than $\sqrt{N}$, that is, he lets $n = \lfloor \sqrt{N} \rfloor$, and then looks for a
square in the sequence

\[ (n + 1)^2 - N, \quad (n + 2)^2 - N, \quad (n + 3)^2 - N, \quad \ldots. \]

That’s all there is to Fermat’s method, and this sequence is especially
easy to write down because after you write the first term, the next is
found by adding $2(n + 1) + 1$, and then the difference between
successive terms simply increases by 2 at each step.

Here is an example using a small number $N = 203$ to illustrate the
method. In this case $n = \lfloor \sqrt{203} \rfloor = 14$, so we begin the sequence by
computing $15^2 - 203 = 22$. For the next term, we add $2 \cdot 15 + 1 = 31$,
and get 53. Then, we add 33 to get 86, and next add 35 to get 121, at
which point we have found a square! And so $18^2 - 203 = 11^2$, and we
have our factorization, since $203 = (18 - 11)(18 + 11) = 7 \cdot 29$.

Use Fermat’s method to factor N = 2 027 651 281. This is the number 
Fermat himself used to illustrate his method. 
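
For readers who would like to see the bookkeeping of the $N = 203$ example spelled out, here is a minimal Python sketch of the method as described above. The function name `fermat_factor` is ours, and the code assumes $N$ is an odd positive integer (for such $N$ the search always ends, at the latest with the trivial factorization $1 \cdot N$). It follows the description literally, adding the successive odd differences rather than recomputing each square.

```python
from math import isqrt

def fermat_factor(N):
    """Write odd N as x^2 - y^2 and return the factors (x - y, x + y)."""
    if N < 1 or N % 2 == 0:
        raise ValueError("this sketch assumes N is an odd positive integer")
    n = isqrt(N)                 # largest integer with n^2 <= N
    if n * n == N:
        return n, n              # N is itself a perfect square
    x = n + 1
    value = x * x - N            # first term of the sequence
    step = 2 * x + 1             # difference to the next term
    while isqrt(value) ** 2 != value:
        value += step            # next term: (x + 1)^2 - N
        step += 2                # differences increase by 2 each time
        x += 1
    y = isqrt(value)
    return x - y, x + y

print(fermat_factor(203))
```

Running it on $N = 203$ visits the terms 22, 53, 86, 121 and returns the factors 7 and 29, as in the worked example.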
5.34 (H,S) Recall from Problem 2.12 that the pentagonal numbers form the

sequence 1, 5, 12, 22, 35, 51, 70, 92, . . . . A number such as 18 can be
written as the sum of three pentagonal numbers: $18 = 12 + 5 + 1$.
However, it is clear that 9 cannot be expressed as a sum of fewer than
five pentagonal numbers: $9 = 5 + 1 + 1 + 1 + 1$.

Show that 89 also requires five pentagonal numbers. Find four other 
numbers between 9 and 89 that require five pentagonal numbers. 

These six numbers are the only positive integers that require five 
pentagonal numbers. 

5.35 (S) There is an error in Zhu Shijie’s beautiful diagram of the binomial 

coefficients. Find it. 

5.36 ★ The formula for the binomial coefficient $\binom{n}{r}$ given in Theorem 5.7 is

frequently used as its definition. Use this formula to prove the recursive
property of binomial coefficients given in Theorem 5.6.

5.37 * (H) Use the binomial theorem, Theorem 5.5, to prove the property of

Pascal’s triangle that the sum of the numbers in any row always equals
a power of 2. That is, prove that

\[ \binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n} = 2^n. \]

A similar property of Pascal’s triangle is that the alternating sum of
the numbers in any row always equals 0. This is obviously true for the
odd-numbered rows: for example, $1 - 5 + 10 - 10 + 5 - 1 = 0$, because
of the symmetry of these rows, but it is not at all obvious for the
even-numbered rows: for example, $1 - 6 + 15 - 20 + 15 - 6 + 1 = 0$.

Use the binomial theorem to prove that Pascal's triangle also has
this property. That is, prove that

\[ \binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \cdots + (-1)^n \binom{n}{n} = 0. \]


5.38 (S) (a) Use the definition of the binomial coefficient $\binom{n}{r}$ to explain the

obvious symmetry in Pascal’s triangle: $\binom{n}{r} = \binom{n}{n-r}$.

(b) It is clear that the numbers in each row of Pascal’s triangle 
increase until the middle of the row and then decrease 
symmetrically. Verify this property. 
5.39 (H,S) Use induction to prove that

\[ 1^2 + 3^2 + 5^2 + \cdots + n^2 = \binom{n+2}{3} \]

and that

\[ 2^2 + 4^2 + 6^2 + \cdots + n^2 = \binom{n+2}{3}. \]

5.40 (H,S) A standard proof of Fermat’s little theorem, in the form a? = a 

(mod p), uses induction. We will outline that proof and let you fill in 
the details. 

The idea is to use induction to prove that, for a given prime p, the 
congruence is true for all positive integers a. You begin with the integer 
a = 1. Obviously, $1^p \equiv 1 \pmod{p}$, so the base case for the induction 
holds. 

Next comes the inductive step. We assume that the congruence is 
true for a positive integer a; then, we must prove that it is also true for 
the next integer a + 1 . 

At this point, the binomial theorem is just what we need, because

$$(a + 1)^p = a^p + \binom{p}{1} a^{p-1} + \binom{p}{2} a^{p-2} + \cdots + \binom{p}{p-1} a + 1.$$

Now, it turns out that all of the binomial coefficients in the middle of 
this expression are divisible by p (this detail needs to be filled in; this 
fact was also used by Fermat to prove the case a = 2). So, 


$$(a + 1)^p \equiv a^p + 1 \equiv a + 1 \pmod{p},$$

by the induction hypothesis. Therefore, the congruence is true for the 
integer a + 1. 

Complete the following details of this proof: 

(a) Prove that $p \mid \binom{p}{r}$ for $1 \le r \le p - 1$.

(b) The induction proof above shows that the congruence is true for
all a ≥ 1. Show that it is also true for all a ≤ 0.

5.41 (S) Use Theorem 5.6 to prove that 


$$\binom{n+r}{r+1} = \sum_{i=1}^{n} \binom{i+r-1}{r}.$$
It was a member of the Colorado College hockey team who once told 
me that where he came from in Minnesota this property of Pascal’s 
triangle is known as the hockey stick theorem. 

5.42 There are endless patterns to be found in Pascal’s triangle. For example, 

you can see the triangular numbers 1, 3, 6, 10, 15, 21, 28, . . . running
down along a diagonal line in the triangle. And, sitting right next to
them, running down along a parallel diagonal line, are the tetrahedral
numbers 1, 4, 10, 20, 35, 56, . . . .

Show that Pascal’s triangle even “contains” the Fibonacci numbers
1, 1, 2, 3, 5, 8, 13, 21, . . . by drawing a series of parallel diagonal lines
that pass through the triangle at roughly 30 degrees such that the 
Fibonacci numbers appear as the sums of the numbers on the 
individual lines. 

5.43 (H) Here is a pattern that was recently discovered in Pascal's triangle. If 

you place a Star of David — that is, two overlapping equilateral 
triangles — over a number in Pascal’s triangle, the product of the three 
numbers at the vertices of each triangle is the same. For example, you 
can place a small Star of David over the number 35 in Pascal’s triangle 
to touch the six immediately surrounding numbers, and the two 
products are 15 • 35 • 56 = 29 400 and 20 • 70 • 21 = 29 400. 

Prove that Pascal’s triangle has this Star of David property. (In fact, 
this Star of David can be expanded and the property still holds. For 
example, the star in Figure 5.6 could be expanded to a larger Star of 
David, still centered on 35, that has 5, 10, 21, 126, 84, 7 as its
associated vertices, or to an even larger Star of David with vertices
1, 4, 7, 210, 120, 1.)

A similar Star of David property was discovered in 1972 in which the 
greatest common divisor of the three numbers at the vertices of each 
triangle is the same. Verify this property for the small Star of David 
placed as in Figure 5.6 over the number 35, and also for the same star 
placed over the number 126 in Pascal's triangle. 
5.44 Use Fermat’s recursive method to derive the following formula for the
sum of the first n cubes:

$$\sum_{i=1}^{n} i^3 = \left(\frac{n(n+1)}{2}\right)^2.$$

Note that this formula says that the sum of the first n cubes is the 
square of the nth triangular number. This formula was first proved by 
Bachet. 

5.45 Use the formula for the sum of the first n cubes in Problem 5.44 (see 
also Problem 4.11) to give a proof of the result in Problem 2.3 that the 
difference between the squares of any two consecutive triangular 
numbers is always a cube. 

5.46 Here is a recursive method for finding formulas for sums of the form
$\sum_{i=1}^{n} i^r$ that is very similar to Fermat’s own recursive method, but uses
the binomial theorem.

We will illustrate the method by finding the formula for $\sum_{i=1}^{n} i^2$.
First, using the binomial theorem, we write

$$(1 + 1)^3 = 1^3 + 3 \cdot 1^2 + 3 \cdot 1^1 + 1,$$

$$(2 + 1)^3 = 2^3 + 3 \cdot 2^2 + 3 \cdot 2^1 + 1,$$

$$\vdots$$

$$(n + 1)^3 = n^3 + 3 \cdot n^2 + 3 \cdot n^1 + 1.$$
Then we add the columns of these equations to get 


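$$\sum_{i=2}^{n+1} i^3 = \sum_{i=1}^{n} i^3 + 3 \sum_{i=1}^{n} i^2 + 3 \sum_{i=1}^{n} i + \sum_{i=1}^{n} 1.$$

The good news is that now almost all of the $i^3$ terms can be canceled
from both sides, leaving us with just

$$(n + 1)^3 - 1 = 3 \sum_{i=1}^{n} i^2 + 3 \sum_{i=1}^{n} i + \sum_{i=1}^{n} 1.$$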
Now comes the recursive part: we use the fact that we already know
that $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ and $\sum_{i=1}^{n} 1 = n$, to rewrite this as

$$(n + 1)^3 - 1 = 3 \sum_{i=1}^{n} i^2 + 3 \cdot \frac{n(n+1)}{2} + n.$$

Finally, we can solve this for $\sum_{i=1}^{n} i^2$, and get the expected formula

$$\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}.$$

Use this recursive method to derive the following formula for the
sum of the first n fourth powers:

$$\sum_{i=1}^{n} i^4 = \frac{n(n+1)(2n+1)(3n^2+3n-1)}{30}.$$
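
A numerical check is a cheap way to catch algebra slips in a derivation like this one. The following Python sketch compares the telescoped identity and both closed formulas against sums computed directly. The function names are ours, chosen only for illustration, and the check is of course no substitute for the derivation itself.

```python
def power_sum(n, r):
    """Sum of i**r for i = 1, ..., n, computed term by term."""
    return sum(i**r for i in range(1, n + 1))


def squares_formula(n):
    """Closed formula n(n+1)(2n+1)/6 for the sum of the first n squares."""
    return n * (n + 1) * (2 * n + 1) // 6


def fourth_powers_formula(n):
    """Closed formula stated in the problem for the sum of the first n fourth powers."""
    return n * (n + 1) * (2 * n + 1) * (3 * n * n + 3 * n - 1) // 30


for n in range(1, 50):
    # The identity left after the cubes cancel: (n+1)^3 - 1 = 3*sum(i^2) + 3*sum(i) + n.
    assert (n + 1) ** 3 - 1 == 3 * power_sum(n, 2) + 3 * power_sum(n, 1) + n
    assert power_sum(n, 2) == squares_formula(n)
    assert power_sum(n, 4) == fourth_powers_formula(n)
```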

5.47 (S) In 1990, D. Zagier published a paper with the title “A One-Sentence 
Proof That Every Prime p ≡ 1 (mod 4) Is a Sum of Two Squares” (see 
The American Mathematical Monthly 97 (1990), 144). The editor 
described this new proof of Fermat’s famous result as “elegant” and 
“extremely short.” Zagier’s proof is certainly elegant, as you will see, 
but the only reason he could state this “proof” in one sentence is that 
he left out all the details. This problem provides the outline of Zagier’s 
proof, and lets you do the work of filling in those details. In particular, 
be looking for the place in the proof where you need to use the 
hypothesis that p ≡ 1 (mod 4). 

For a prime p of the form 4n + 1, we define a set

S = {(x, y, z)}

to be the set of all points (x, y, z) such that x, y, z are positive integer 
solutions to the equation 

$$x^2 + 4yz = p.$$

So, for example, if p = 29, the only possible values for x are x = 1, 3, 5
since x must be odd. For x = 1, we see that yz = 7, and there are two
choices for y and z. Similarly, for x = 3, we have yz = 5, and again there
are two choices for y and z. But for x = 5, we get yz = 1, and the only
possibility is y = z = 1. Hence,

S = {(1, 1, 7), (1, 7, 1), (3, 1, 5), (3, 5, 1), (5, 1, 1)}.
There are two distinct parts to this proof. The first part, where all the
work is, is to prove that the set S has an odd number of elements.
Then, the second part, showing how this implies Fermat's famous
result, is easy. As we begin the first part, here is a small detail to deal
with immediately.

(a) Explain why the set S is finite. 

We now partition the set S into three sets, A, B, and C. If (x, y, z) ∈ S,
clearly y - z < 2y. We determine which set (x, y, z) belongs to
according to how x compares with these two numbers:

if x < y - z, then (x, y, z) ∈ A;

if y - z < x < 2y, then (x, y, z) ∈ B;

if x > 2y, then (x, y, z) ∈ C.

(b) Prove that if (x, y, z) ∈ S, it is never the case that x = y - z or

x = 2y. Hence the three sets A, B, and C really do partition the
set S.

At this point you might want to think about this situation 
geometrically. The points in S all lie in the curved surface determined
by the equation $x^2 + 4yz = p$. If three-dimensional space is represented
by a coordinate system where the xy-plane is horizontal and the z-axis
is vertical, then the set A is just all the points in S that fall below the
plane x = y - z. The set B is the points in S that lie between the plane
x = y - z and the vertical plane x = 2y, and the set C is the points in S
that lie on the other side of the vertical plane x = 2y. What you proved
in part (b) is that neither of these planes intersects the set S. 

In order to prove that the set S has an odd number of elements we
define a function f : S → S, as follows:

$$f(x, y, z) = \begin{cases} (x + 2z,\; z,\; y - x - z), & \text{if } (x, y, z) \in A; \\ (2y - x,\; y,\; x - y + z), & \text{if } (x, y, z) \in B; \\ (x - 2y,\; x - y + z,\; y), & \text{if } (x, y, z) \in C. \end{cases}$$

The first thing to check about this function is perhaps the most 

amazing thing about it. 

(c) Verify that if (x, y, z) ∈ S, then f(x, y, z) ∈ S. 

We claim that f is a one-to-one correspondence on the set S, and 
that f takes elements in A to C; it takes elements in C to A; and it takes 
elements in B to B. 

(d) Prove that if (x, y, z) ∈ A, then f(x, y, z) ∈ C; and if (x, y, z) ∈ C,
then f(x, y, z) ∈ A; and if (x, y, z) ∈ B, then f(x, y, z) ∈ B.

(e) Prove that f : S → S is a one-to-one correspondence by proving
that the function f is its own inverse.

At this point, you might want to confirm for yourself that for p = 29 
the partition of S is given by 

A = {(1, 7, 1), (3, 5, 1)}, B = {(1, 1, 7)}, C = {(3, 1, 5), (5, 1, 1)}, 

and that the function f does swap the elements in A with those in C, 
and does take the element (1, 1, 7) in B to itself. That is, the element 
(1, 1, 7) is what we call a fixed point, because it is not moved by the 
function f. 
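
If you would rather let a computer do this confirmation, here is a minimal Python sketch. The names `zagier_set` and `zagier_map` are our own labels, not anything from Zagier's paper. The sketch assumes p is a prime of the form 4n + 1, so that the boundary cases x = y - z and x = 2y from part (b) never occur.

```python
def zagier_set(p):
    """All triples (x, y, z) of positive integers with x^2 + 4yz = p."""
    triples = []
    x = 1
    while x * x < p:
        rest = p - x * x
        if rest % 4 == 0:
            yz = rest // 4
            for y in range(1, yz + 1):
                if yz % y == 0:
                    triples.append((x, y, yz // y))
        x += 1
    return triples


def zagier_map(triple):
    """The function f from the text, defined piecewise on the sets A, B, and C."""
    x, y, z = triple
    if x < y - z:                          # (x, y, z) is in A
        return (x + 2 * z, z, y - x - z)
    if x < 2 * y:                          # (x, y, z) is in B: y - z < x < 2y
        return (2 * y - x, y, x - y + z)
    return (x - 2 * y, x - y + z, y)       # (x, y, z) is in C: x > 2y


S = zagier_set(29)
print(S)
for t in S:
    print(t, "->", zagier_map(t))
print("fixed points:", [t for t in S if zagier_map(t) == t])
```

For p = 29 the printed set has five elements, each element of A is sent to an element of C and back again, and (1, 1, 7) is the only triple that f leaves in place.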

(f) Prove that f : S → S has a single fixed point, that is, a single value
of (x, y, z) such that f(x, y, z) = (x, y, z).

Since f has a single fixed point, we know that S has an odd number 
of elements because all the other elements can be paired off, each 
element (x, y, z) together with its image f(x, y, z). We are now finished 
with the function f and the first part of the proof. 

For the second part of the proof, we define a function g : S → S, as
follows:

g(x, y, z) = (x, z, y).

(g) Verify that if (x, y, z) ∈ S, then g(x, y, z) ∈ S.

This function is clearly also a one-to-one correspondence on the set
S and is also its own inverse. It is also clear that for a value of (x, y, z)
for which y ≠ z, the function g simply switches the values of y and z.
Hence, in this way, we can pair off distinct elements of S, (x, y, z) and
(x, z, y), whenever y ≠ z.

However, since S has an odd number of elements, not all of its 
elements can be paired in this way, so there must be some element 
(x, y, z) in S such that y = z. For this element in S, then, since y = z, we
have

$$x^2 + (2y)^2 = x^2 + 4yz = p,$$

and p is a sum of two squares. This completes the “one-sentence” 
proof. 

(h) For p = 29, note which elements of S can be paired off in this
way, find an element (x, y, z) in S such that y = z, and use this
element to write 29 as a sum of two squares.
The last two chapters presented the beginning of modern number 
theory as being largely the result of the work of a single man, Pierre 
de Fermat, who in turn had been inspired by the much earlier work of 
the ancient Greek mathematician Diophantus. In Chapter 7, we will 
see that number theory, in effect, had to begin anew in the century 
following the death of Fermat, when the great Leonhard Euler picked 
up the torch passed on by Fermat. 

However, in this chapter, we are going to step out of the historical 
flow of our story, as we did briefly in Chapter 3, in order to discuss 
the topic of congruences in somewhat more detail than we did in that 
chapter. In particular, this will allow us to present modern proofs of 
several important results. We begin with the “slick” proof of Fermat’s 
little theorem, as promised in the last chapter. 


Fermat's Little Theorem 

Fermat’s little theorem says that if p is prime, and does not divide a,
then $a^{p-1} \equiv 1 \pmod{p}$. In 1806, James Ivory found a very elegant proof
of this result based on a simple idea that involves congruences.

Let’s take the prime p = 7, and consider the numbers 1, 2, . . . , p - 1;
that is,


1, 2, 3, 4, 5, 6. 

Next, let a = 3, and consider the numbers 

a, 2a, 3a, 4a, 5a, 6a; 


that is, 


3, 6, 9, 12, 15, 18, 

but instead, we will write these six numbers modulo 7, and get 


3, 6, 2, 5, 1, 4.

It is immediately obvious (and this is the simple idea) that these are
the same six numbers we began with, just rearranged.

Here is a proof of Fermat’s little theorem based on this simple idea. 

Proof of Theorem 5.2 (Fermat's little theorem) 

Consider the p - 1 numbers a, 2a, 3a, . . . , (p - 1)a. We claim that these
numbers are congruent, in some order, to the numbers 1, 2, 3, . . . ,
p - 1. Since p does not divide a, none of these numbers are congruent
to 0 modulo p, so it is sufficient to show that these p - 1 numbers are
distinct modulo p.

Suppose that two of these numbers are not distinct; that is, for r ≠ s,
suppose that ra ≡ sa (mod p). Then, by Theorem 3.7, since a and p are
relatively prime, we can divide by a, and get r ≡ s (mod p). But both r
and s are integers in the set {1, 2, . . . , p - 1}, so this implies that r = s,
which is a contradiction.

Therefore, the numbers 

a, 2a, 3a, . . . , (p - 1)a

are, in some order, congruent modulo p to the numbers

1, 2, 3, . . . , p - 1.

Thus

$$a \cdot 2a \cdot 3a \cdots (p - 1)a \equiv 1 \cdot 2 \cdot 3 \cdots (p - 1) \pmod{p}.$$

Now, each of the numbers 2, 3, . . . , p - 1 is relatively prime to p, so we
can, again by Theorem 3.7, divide this congruence by each of these
numbers, which amazingly enough yields exactly what we want:

$$a^{p-1} \equiv 1 \pmod{p}.$$

This completes the proof. ■ 
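
Each step of this proof can be watched in action for a particular prime. The Python sketch below is only an illustration under our own naming (`ivory_check` is a hypothetical helper, and `math.prod` requires Python 3.8 or later). It lists the multiples of a modulo p, confirms that they are a rearrangement of 1, 2, . . . , p - 1, compares the two products, and evaluates the final power.

```python
from math import prod


def ivory_check(a, p):
    """Trace the rearrangement argument for a prime p that does not divide a."""
    multiples = [(k * a) % p for k in range(1, p)]
    # The multiples should be 1, 2, ..., p-1 in some order.
    is_rearrangement = sorted(multiples) == list(range(1, p))
    # Product of a, 2a, ..., (p-1)a versus product of 1, 2, ..., p-1, both mod p.
    products_agree = prod(k * a for k in range(1, p)) % p == prod(range(1, p)) % p
    # The conclusion of the theorem: a^(p-1) mod p.
    return multiples, is_rearrangement, products_agree, pow(a, p - 1, p)


print(ivory_check(3, 7))
```

With a = 3 and p = 7 the first entry is the list 3, 6, 2, 5, 1, 4 from the example above, and the last entry is 1.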

You may be wondering why we didn't use this as the proof for 
Fermat’s little theorem in Chapter 5 since it is so simple. One reason 
is that our approach there was historical, in the sense that we wanted 
to represent as closely as possible the way Fermat thought about his 
theorem. Another, perhaps better, reason is that the proof in Chapter 5 
actually illuminates why the theorem is true; and, in particular, brings 
out the important idea in Corollary 1 that the least positive integer n for
which $a^n \equiv 1 \pmod{p}$ must divide p - 1.

The above proof of Fermat’s little theorem, for all its elegance, is
somewhat less illuminating. We include it in this chapter, and even
begin the chapter with it, because it is such a wonderful illustration
of the effectiveness of congruences. You may have been able to sense,
however, that both arguments are fundamentally the same.


Linear Congruences 

Almost all of our attention in this chapter will be placed on solving 
various polynomial congruences 

$$a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \equiv 0 \pmod{m},$$

and we will especially look for any similarities they may have with 
polynomial equations. When we speak of a solution to a polynomial 
congruence, we mean any integer x that satisfies the congruence, but
we are primarily interested in solutions in the set {0, 1, 2, . . . ,
m - 1}. So, for example, if two integers x and y both satisfy a polynomial
congruence, but x and y are congruent modulo m, we do not consider
them to be distinct solutions.

Even relatively simple looking quadratic congruences such as 

$$x^2 - 1 \equiv 0 \pmod{p} \quad \text{and} \quad x^2 + 1 \equiv 0 \pmod{p},$$

where p is a prime, can be somewhat trickier to solve than their
equational counterparts:

$$x^2 - 1 = 0 \quad \text{and} \quad x^2 + 1 = 0.$$

So, in this section we will investigate first the polynomial congruences
that are the simplest, those that are linear. A linear congruence, as
the name suggests, is a polynomial congruence of the form
$a_1 x + a_0 \equiv 0 \pmod{m}$, although for convenience we will also often write
a linear congruence in the equivalent form

ax ≡ b (mod m).

We could, at this point, appeal directly to Theorem 4.1, and phrase 
this congruence in terms of a linear Diophantine equation in order to 
describe its solutions (see Problem 6.3). However, it will be instructive 
for us to examine this congruence in some detail independent of that 
theorem. 
Now, if this were an equation, ax = b, we wouldn’t even be talking
about it, because we would just solve it by “dividing by a” to get x = b/a.
Can we do something similar with the congruence ax ≡ b (mod m)?

Well, we certainly can in a case such as 5x ≡ 10 (mod 18), because
by Theorem 3.7 we can divide this congruence by 5. But what about the
congruence 5x ≡ 11 (mod 18)? This doesn’t look so promising since
the “answer” 11/5 is not an integer. Still, 5 and 18 are relatively prime, so
the idea of “dividing by 5” is a good idea; we just need to do it a different
way.

It turns out that for this special case (that is, where a and m are
relatively prime) we can use the idea we just saw in the proof of
Fermat’s little theorem. So let x run through all of the values from 0
to m - 1, and write the m numbers of the form ax:


a • 0, a • 1, a • 2, a • 3, . . . , a(m - 1).


Just as in the proof of Fermat’s little theorem in the previous section, 
these numbers are all distinct modulo m, since a and m are relatively 
prime. 

Therefore, b must be congruent to one of these m numbers! That is,
ax ≡ b (mod m) has a solution, and in fact the solution is unique.

Let’s look at the example: 5x ≡ 11 (mod 18). Letting x run from 0 to
17 we get the 18 numbers

0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85,

which we can rewrite modulo 18 as 

0, 5, 10, 15, 2, 7, 12, 17, 4, 9, 14, 1, 6, 11, 16, 3, 8, 13. 

We see that every number 0, 1, 2, . . . , 17 appears exactly once on this 
list, and, in particular, 11 occurred when we took x = 13 to get 65; thus 
x = 13 is the unique solution for this congruence. 
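
The search just described is easy to carry out mechanically. Here is a small Python sketch of it; `solve_by_search` is a name we made up for this illustration, and the function simply tries every x from 0 to m - 1.

```python
def solve_by_search(a, b, m):
    """Return every x in {0, ..., m-1} with a*x congruent to b modulo m."""
    return [x for x in range(m) if (a * x - b) % m == 0]


# The numbers 5x reduced modulo 18, for x = 0, 1, ..., 17.
multiples = [(5 * x) % 18 for x in range(18)]
print(multiples)
# Every residue 0, 1, ..., 17 appears exactly once.
print(sorted(multiples) == list(range(18)))
print(solve_by_search(5, 11, 18))
```

The last line prints the single solution 13, in agreement with the list above.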

We can now solve a linear congruence ax ≡ b (mod m) when a
and m are relatively prime; moreover, the solution is unique. Note
that, since we can solve such a congruence for x, we have in some
sense (even though b/a may not be an integer) been able to “divide
by a.”

This is a convenient time to introduce some useful terminology. For a
given modulus m, any set (such as the set 0, 5, 10, . . . , 85 above) that
contains m numbers that represent, modulo m, all the distinct numbers
0, 1, 2, . . . , m - 1 is called a complete system of residues. Note that we
have already been using the idea that any set consisting of m numbers,
no two of which are congruent modulo m, must necessarily be a
complete system of residues.

Next, let’s consider what happens when a and m are not relatively 
prime. As an example of this situation, consider 6x ≡ 9 (mod 15). Using 
the same strategy as before, we’ll let x run through all the values of the 
complete residue system from 0 to 14, and we write all terms of the form 
6x, getting 

0, 6, 12, 18, 24, 30, 36, 42, 48, 54, 60, 66, 72, 78, 84, 
which we rewrite modulo 15 as 

0, 6, 12, 3, 9, 0, 6, 12, 3, 9, 0, 6, 12, 3, 9. 

This time, we don’t get a complete residue system at all. In fact, we
see that the only values of b for which there could have been a solution
for 6x ≡ b (mod 15) are b = 0, 3, 6, 9, 12, that is, multiples of 3, the
greatest common divisor of 6 and 15. Further, we see that the number
we are looking for, 9, occurs in the list exactly three times, and therefore,
the congruence 6x ≡ 9 (mod 15) has three solutions x = 4, 9, 14.

We can glean several very useful general facts from this simple example.
The fact that there were only five values that b could have been in
this case, and that they repeat in the cycle 0, 6, 12, 3, 9, was determined
by the gcd(6, 15) being 3, because then 6 • 5 = 6(15/3) = (6/3) • 15, which
is guaranteed to be 0 modulo 15. In turn, this also meant that once you
found one solution, such as x = 4, then you could immediately write
down the other solutions x = 4 + 5 = 9 and x = 9 + 5 = 14. Also, we
know that this had to be all the solutions, because 14 + 5 ≡ 4 (mod 15).
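
The same experiment can be repeated for any a, b, and m. In the Python sketch below, `residues_and_solutions` is a hypothetical helper of our own; it tabulates the products ax modulo m, collects the values of b that can occur, and lists the solutions for one particular b.

```python
from math import gcd


def residues_and_solutions(a, b, m):
    """Tabulate a*x mod m for x = 0..m-1, the attainable values, and solutions for b."""
    products = [(a * x) % m for x in range(m)]
    attainable = sorted(set(products))
    solutions = [x for x, value in enumerate(products) if value == b % m]
    return products, attainable, solutions


products, attainable, solutions = residues_and_solutions(6, 9, 15)
d = gcd(6, 15)
print(products)          # the cycle 0, 6, 12, 3, 9 repeated three times
print(attainable)        # the multiples of d below 15
print(solutions)         # consecutive solutions differ by 15 // d
print(d, 15 // d)
```

Trying other pairs a and m in the same way is a good way to see why the gap between consecutive solutions is always m divided by the greatest common divisor.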

We can summarize our conclusions concerning the linear congruence
ax ≡ b (mod m) in the following theorem, first proved by Bachet
in 1612. You will be asked to prove this theorem directly from
Theorem 4.1 in Problem 6.3.


Theorem 6.1. The linear congruence ax ≡ b (mod m) has a solution if and
only if d | b, where d = gcd(a, m). Moreover, if x is a solution, then there are
exactly d solutions

$$x,\quad x + \frac{m}{d},\quad x + \frac{2m}{d},\quad x + \frac{3m}{d},\quad \ldots,\quad x + \frac{(d-1)m}{d}.$$


There is still the practical matter of how to go about finding a first 
solution to a linear congruence ax = b (mod m) even when we know 
it exists. Once we know one solution, it is easy to write down all d 
solutions. You could imitate the proof of the theorem and simply let 
x run through the values 0, 1, 2, . . . , m - 1 and compute each ax until 
you found b, and that is not at all a bad strategy, especially for relatively 
small values of m. 

Another approach is to manipulate the congruence until a solution
can be seen by inspection. For a congruence such as 14x ≡ 5 (mod 23)
it is easy to spot the solution x = 2 by inspection, but the congruence
14x ≡ 11 (mod 23) doesn’t yield an obvious answer. However, if we
multiply this congruence by 5, we get 70x ≡ 55 (mod 23), which
reduces modulo 23 to x ≡ 9 (mod 23), so x = 9 is the solution.

One approach that always produces a solution is to use the idea that
since d | b, we can use the Euclidean algorithm to write d = ra + sm, and
then x = r(b/d) is a solution since

ax = (b/d)ar ≡ (b/d)(ar + sm) = (b/d)d = b (mod m).

We can illustrate this approach for the congruence 14x ≡ 24 (mod 45).
Here Euclid’s algorithm produces 1 = (-16) • 14 + 5 • 45, that is, r = -16.
So x = (-16) • 24 = -384, which we reduce modulo 45 to get x = 21.
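
The Euclidean approach translates directly into a short program. The following Python sketch is one possible way to organize it, not a definitive implementation; the names `extended_gcd` and `solve_linear_congruence` are ours. It finds r and s with d = ra + sm, forms the first solution r(b/d), and then lists all d solutions as in Theorem 6.1.

```python
def extended_gcd(a, b):
    """Return (g, r, s) with g = gcd(a, b) = r*a + s*b (extended Euclidean algorithm)."""
    old_rem, rem = a, b
    old_r, r = 1, 0
    old_s, s = 0, 1
    while rem != 0:
        q = old_rem // rem
        old_rem, rem = rem, old_rem - q * rem
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    return old_rem, old_r, old_s


def solve_linear_congruence(a, b, m):
    """All solutions x in {0, ..., m-1} of a*x congruent to b modulo m (empty if none)."""
    d, r, _ = extended_gcd(a, m)
    if b % d != 0:
        return []                      # no solution unless d divides b
    step = m // d
    first = (r * (b // d)) % step      # reduce the first solution r*(b/d)
    return [first + k * step for k in range(d)]


print(extended_gcd(14, 45))
print(solve_linear_congruence(14, 24, 45))
print(solve_linear_congruence(6, 9, 15))
```

For 14 and 45 the coefficients found are r = -16 and s = 5, the same ones used above, and the two congruences return x = 21 and x = 4, 9, 14 respectively.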

Another idea to deal with a congruence such as 14x ≡ 24 (mod 45)
is to factor the modulus 45 into relatively prime factors, and then solve
the two congruences


14x ≡ 24 (mod 5) and 14x ≡ 24 (mod 9)

simultaneously. The first congruence reduces to 4x ≡ 4 (mod 5), which
has an obvious solution x = 1, and so x = 1, 6, 11, 16, 21, . . . , are all
possible solutions for the original congruence. The second congruence
reduces to 5x ≡ 6 (mod 9), which has a solution x = 3, and so x =
3, 12, 21, . . . , are the solutions for the second congruence. Since 21
appears on both lists, x = 21 is the solution to the original congruence.


Inverses 

We so often use one particular case of a linear congruence that it is
worth discussing separately. This is the case ax ≡ 1 (mod m), where a
and m are relatively prime. We know that this congruence has a unique
solution; that is, there is a unique integer b in the complete residue
system 0, 1, 2, . . . , m - 1 such that

ab ≡ 1 (mod m).

We call the integer b the inverse of a, or sometimes the multiplicative
inverse of a.

The reason for the use of the word inverse here is that, in the context
of this congruence modulo m, the integers a and b are behaving
multiplicatively in exactly the same way that, for example, two real numbers
such as 17 and 1/17 behave in the equation 17(1/17) = 1; and we think of the
fraction 1/17, or $17^{-1}$, as being the inverse of 17.

Consider the congruence 17x ≡ 1 (mod 23), which has the solution
x = 19. Therefore, modulo 23, the number 19 is the inverse of 17.
Note that 19 behaves exactly as the fraction 1/17 behaves above, since
17(19) ≡ 1 (mod 23).

The reason that inverses are so useful in congruences is that they
allow us to divide in situations where it might not look possible. For
example, suppose we wish to solve the congruence $17x^2 \equiv 15 \pmod{23}$.
We would like to divide by 17, but we can’t, because 17 ∤ 15. So, instead,
we multiply by the inverse of 17, getting $19 \cdot 17x^2 \equiv 19 \cdot 15 \pmod{23}$,
which, since 19 • 17 ≡ 1 (mod 23), becomes $x^2 \equiv 9 \pmod{23}$. This
congruence, as it happens, we can immediately solve, getting x ≡ ±3.
That is, x = 3 and x = 20. (We don’t actually know at this point
that these are the only solutions, but we will soon have a theorem that
guarantees this.)
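
This computation can be replayed in a few lines of Python. The sketch assumes Python 3.8 or later, where the built-in `pow` accepts the exponent -1 together with a modulus and returns the inverse; the variable names are ours.

```python
# Inverse of 17 modulo 23 (three-argument pow with exponent -1, Python 3.8+).
inverse_of_17 = pow(17, -1, 23)

# "Divide" 17x^2 ≡ 15 (mod 23) by 17: multiply the right-hand side by the inverse.
reduced_rhs = (inverse_of_17 * 15) % 23

# Search all residues for solutions of x^2 ≡ reduced_rhs (mod 23).
roots = [x for x in range(23) if (x * x) % 23 == reduced_rhs]

# Each root should satisfy the original congruence as well.
checks = [(17 * x * x) % 23 for x in roots]

print(inverse_of_17, reduced_rhs, roots, checks)
```

The search over all 23 residues finds only x = 3 and x = 20, which supports, for this one modulus, the remark that there are no other solutions.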

In other words, as this example shows, we can “divide” by a number 
a by multiplying by its inverse. This is why inverses are so important 
to us in number theory; they allow us to divide. This, too, is why 
the congruence ax ≡ 1 (mod m) is so important, because it, together 
with Theorem 6.1, tells us precisely when an integer a has an inverse 
modulo m. 

Inverses play a key role in the proof of our very next theorem, the 
Chinese remainder theorem. 


The Chinese Remainder Theorem 

We saw earlier that one approach for solving a congruence such as
14x ≡ 24 (mod 45) is to factor the modulus 45 into relatively prime
factors, and then solve the two congruences

14x ≡ 24 (mod 5) and 14x ≡ 24 (mod 9)

simultaneously. This idea of solving a system of congruences
simultaneously is very old.

In his fourth-century text Master Sun's Arithmetic Manual, the
Chinese mathematician Sun Zi included the following problem:

There is an unknown number of objects. When counted in threes
the remainder is 2; when counted in fives the remainder is 3; and
when counted in sevens the remainder is 2. How many objects are
there? 

Nicomachus included this same problem in his Introduction to 
Arithmetic. 

This now famous problem asks for a simultaneous solution for three 
linear congruences: 

x ≡ 2 (mod 3); x ≡ 3 (mod 5); x ≡ 2 (mod 7).

The first congruence means x = 2, 5, 8, 11, 14, 17, 20, 23, . . . , are
possible solutions; the second congruence yields x = 3, 8, 13, 18,
23, 28, 33, . . . ; and the third produces x = 2, 9, 16, 23, 30, 37, . . . .

Since 23 appears on all three lists, x = 23 is the solution to Sun Zi’s 
problem. Moreover, if you were to extend each of these lists as far as 105, 
you would see that 23 is the only number that is common to all three 
lists, that is, 23 is the unique solution to this system of congruences 
modulo 105. 
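
Extending the three lists by hand is tedious, so here is a short Python sketch that does the same comparison. The function `common_solutions` is a name we introduce only for this illustration; it tests every number below a chosen limit against each congruence.

```python
def common_solutions(congruences, limit):
    """Numbers 0 <= x < limit with x ≡ a (mod m) for every pair (a, m) given."""
    return [x for x in range(limit) if all(x % m == a % m for a, m in congruences)]


sun_zi = [(2, 3), (3, 5), (2, 7)]
print(common_solutions(sun_zi, 105))
print(common_solutions(sun_zi, 3 * 105))
```

Below 105 the only common value is 23, and below 315 the common values are 23, 128, and 233, each 105 more than the one before.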

Because of the apparent ancient Chinese origin of this problem, the 
following theorem, which guarantees that systems of linear congruences
with relatively prime moduli will always have a solution, is now
known as the Chinese remainder theorem. This theorem was stated in its 
modern form, and proved, by Euler. 

Theorem 6.2 (the Chinese remainder theorem). If $m_1, m_2, \ldots, m_k$ are
pairwise relatively prime moduli, then the system of linear congruences

$$x \equiv a_1 \pmod{m_1}; \quad x \equiv a_2 \pmod{m_2}; \quad \ldots \quad x \equiv a_k \pmod{m_k}$$

has a unique solution modulo $m = m_1 m_2 \cdots m_k$.


Proof

For each modulus $m_i$ we define an integer $M_i = m/m_i$, that is,

$$M_i = m_1 m_2 \cdots m_{i-1} m_{i+1} m_{i+2} \cdots m_k$$

is the product of all the moduli with the single modulus $m_i$ omitted.
Then, since $m_i$ is relatively prime to each of the other moduli, $m_i$ is
also relatively prime to $M_i$. Therefore, by Theorem 6.1 the congruence
$M_i x \equiv 1 \pmod{m_i}$ has a unique solution $x = N_i$. In other words, $N_i$ is
the inverse of $M_i$ modulo $m_i$, since $M_i N_i \equiv 1 \pmod{m_i}$.

Now, we claim that

$$x = a_1 M_1 N_1 + a_2 M_2 N_2 + \cdots + a_k M_k N_k$$

is the unique solution for the system of congruences modulo m. It is
clear that x is a solution for each congruence, because for the congruence
$x \equiv a_i \pmod{m_i}$, $m_i$ divides each $M_j$ where $j \ne i$ simply because
$m_i$ appears in the product for $M_j$; thus $x \equiv a_i M_i N_i \pmod{m_i}$, and then,
by the choice of $N_i$, we have $x \equiv a_i M_i N_i \equiv a_i \pmod{m_i}$.

Finally, x must be the unique solution modulo m, because if y is
another solution to each congruence, then $y \equiv a_i \pmod{m_i}$, and so
$y \equiv x \pmod{m_i}$ for each i. Thus $m_i \mid (y - x)$ for each i. But then, by
Theorem 3.2, since the $m_i$ are pairwise relatively prime, $m \mid (y - x)$, and
$y \equiv x \pmod{m}$. This completes the proof. ■

Here is a simple thought experiment to help understand how this
proof is working, and also how cleverly the solution x has been
constructed. Imagine you have a set of “modular” glasses, one pair for each
modulus $m_i$. So, for example, when you put on your “mod $m_1$” glasses
and you look at $M_1 N_1$, what you in fact see is just the number 1; and, still
with your mod $m_1$ glasses on, when you look at $M_2 N_2$, or any of the
other $M_i N_i$, what you see is the number 0. Your other mod $m_i$ glasses all
work the same way, each filtering numbers through their own modulus.
Now, with your mod $m_1$ glasses on again, what do you see when you
look at the expression

$$x = a_1 M_1 N_1 + a_2 M_2 N_2 + \cdots + a_k M_k N_k ?$$

All you see is $x = a_1 \cdot 1 + a_2 \cdot 0 + \cdots + a_k \cdot 0$; that is, you see $x = a_1$.
Switching next to your pair of mod $m_2$ glasses, what you can now see is
$x = a_1 \cdot 0 + a_2 \cdot 1 + a_3 \cdot 0 + \cdots + a_k \cdot 0$; that is, you see $x = a_2$. That’s the way
the proof works, and it even becomes clear that x should be constructed
in exactly the way it was to be the solution.

The proof of the Chinese remainder theorem provides us with an
explicit way to produce the solution for a system of linear congruences.
We can illustrate this using Sun Zi’s problem

x ≡ 2 (mod 3); x ≡ 3 (mod 5); x ≡ 2 (mod 7).

In this case, m = 3 • 5 • 7 = 105, and $M_1 = 5 \cdot 7$, $M_2 = 3 \cdot 7$, $M_3 = 3 \cdot 5$.

Solving 35x ≡ 1 (mod 3), we first reduce this to 2x ≡ 1 (mod 3),
and x = 2 is the solution; thus $N_1 = 2$. The next congruence 21x ≡
1 (mod 5) has an obvious solution x = 1; thus $N_2 = 1$. The final
congruence 15x ≡ 1 (mod 7) also has an obvious solution x = 1; thus
$N_3 = 1$.

Therefore, the solution is given by

x = 2 • 35 • 2 + 3 • 21 • 1 + 2 • 15 • 1 = 140 + 63 + 30 = 233 ≡ 23 (mod 105).
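
The construction in the proof can be followed step by step in code. The Python sketch below is a minimal illustration rather than a polished routine; the function name `chinese_remainder` and its return format are our own choices, and it assumes Python 3.8 or later for `math.prod` and for `pow` with exponent -1. The moduli must be pairwise relatively prime, or the inverse will not exist and `pow` will raise an error.

```python
from math import prod


def chinese_remainder(residues, moduli):
    """Build x = a_1 M_1 N_1 + ... + a_k M_k N_k as in the proof; return (x mod m, m, terms)."""
    m = prod(moduli)
    terms = []
    for a_i, m_i in zip(residues, moduli):
        M_i = m // m_i                 # product of all the other moduli
        N_i = pow(M_i, -1, m_i)        # inverse of M_i modulo m_i
        terms.append(a_i * M_i * N_i)
    return sum(terms) % m, m, terms


print(chinese_remainder([2, 3, 2], [3, 5, 7]))
```

For Sun Zi's problem the three terms are 140, 63, and 30, and their sum 233 reduces to 23 modulo 105.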

In Master Sun’s Arithmetic Manual, the solution was described as 
follows: 

If you count in threes and have remainder 2, then put 140. If you 
count in fives and have remainder 3, then put 63. If you count in 
sevens and have remainder 2, then put 30. Add these numbers 
and you get 233, from this subtract 210 and you have the 
answer 23. 

Is it any wonder that we call this the Chinese remainder theorem? 
There was even a popular song, “The Song of Master Sun,” that helped 
people remember how to solve this problem: 

Not in every third person is there one aged three score and ten, 

On five plum trees only twenty-one boughs remain, 

The seven learned men meet every fifteen days, 

We get our answer by subtracting one hundred and five over and 
over again. 


Wilson's Theorem 

In 1771, Lagrange gave a proof of a remarkable property of numbers that 
involves congruences, prime numbers, and factorials. This property 
perhaps should have been named for Lagrange, but it is instead called 
Wilson’s theorem, for John Wilson was a student at Cambridge a year
or two earlier when he noticed (but did not prove) the same property.
This property had also been conjectured almost a century earlier by
Leibniz. We will give a marvelous proof of Wilson’s theorem that is due
to Gauss.

What Wilson (and Leibniz before him) noticed is that whenever
p is a prime, then


(p - 1)! ≡ -1 (mod p).

For example, 12! ≡ -1 (mod 13), since


12! + 1 = 479 001 601 = 13 • 36 846 277.
However, this computation doesn't illuminate much of anything
about why 12! ≡ -1 (mod 13), which is why neither John Wilson, who
presumably did numerous computations of this kind, nor his teacher at
Cambridge, Edward Waring (whom we will meet again in Chapter 7),
could find a proof of this property. But Gauss uncovered the following
extraordinarily simple pattern.

If you take the numbers in the expression 12! = 1 • 2 • 3 • • • 12, you
can rearrange these numbers as follows:

1 • 12 • (2 • 7) • (3 • 9) • (4 • 10) • (5 • 8) • (6 • 11),

and then look at them modulo 13. The first number in this expression
is 1, the second number is congruent to -1, and the rest of the numbers
have all been paired off in such a way that each product is congruent to
1 modulo 13. In other words, each number, other than 1 and -1, has
been paired with its inverse, so the entire product is just -1. That’s all 
there is to Gauss’s proof. Here is his proof in detail. 

Theorem 6.3 (Wilson's theorem). If p is a prime, then 

(p - 1)! ≡ -1 (mod p). 


Proof 

The idea of Gauss's proof is to pair off inverses, that is, numbers a
and b such that ab ≡ 1 (mod p), leaving us with just 1 and -1 remaining.
Thus the product of all the numbers will be -1 as claimed.

We know, because of Theorem 6.1, that each of the numbers 1, 2, ..., p - 1
has a unique inverse among the numbers 1, 2, ..., p - 1. Therefore,
only one thing needs to be checked. We must verify that, except for 1
and p - 1, every number a in the set {1, 2, ..., p - 1} has an inverse b in
the set such that b ≠ a. In other words, we are confirming that we can
actually “pair off” these integers, by pairing an integer a with its inverse
b, where a and b are different numbers.

Suppose that a is a number in the set {1, 2, ..., p - 1} whose inverse
is not different from a, but is a itself; that is, a is its own inverse. Thus
a^2 ≡ 1 (mod p). This means that p | (a^2 - 1), and so p | (a - 1)(a + 1).
By Theorem 3.3 (Euclid's lemma), then, p | (a - 1) or p | (a + 1). Therefore,
a = 1 or a = p - 1. We conclude that 1 and p - 1 are the only numbers in
the set of numbers {1, 2, 3, ..., p - 1} that pair with themselves. (Note
that in the event that p = 2, then 1 and p - 1 also happen to be the
same number.)

We can now rearrange the numbers in (p - 1)! by pulling 1 and 
p - 1 to the front, and putting the rest of the numbers in inverse pairs. 


Schematically, we represent this as

(p - 1)! = 1 · (p - 1) · (a_1 b_1) · (a_2 b_2) ··· (a_k b_k),

where each a_i, b_i is an inverse pair, and {a_1, b_1, a_2, b_2, ..., a_k, b_k}
is the same set of numbers as {2, 3, ..., p - 2}.

Since each inverse pair is congruent to 1 modulo p, we get

(p - 1)! ≡ 1 · (p - 1) · (1) · (1) ··· (1) = p - 1 ≡ -1 (mod p).

This completes the proof. ■ 
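Gauss's pairing can also be carried out mechanically. The following is a
minimal Python sketch, not part of the original proof: the function names
inverse_pairs and factorial_mod are our own, and the code assumes that p is
a prime and that Python 3.8 or later is available (for the three-argument
pow with exponent -1).

```python
def inverse_pairs(p):
    """Pair each a in 2..p-2 with its inverse b modulo the prime p."""
    pairs = []
    seen = set()
    for a in range(2, p - 1):
        if a in seen:
            continue
        b = pow(a, -1, p)          # the unique b with a*b ≡ 1 (mod p)
        pairs.append((a, b))
        seen.update((a, b))
    return pairs


def factorial_mod(p):
    """Compute (p - 1)! modulo p, reducing after every multiplication."""
    result = 1
    for k in range(1, p):
        result = result * k % p
    return result


print(inverse_pairs(13))           # [(2, 7), (3, 9), (4, 10), (5, 8), (6, 11)]
print(factorial_mod(13) == 13 - 1) # True, that is, 12! ≡ -1 (mod 13)
```

For p = 13 the pairs produced are exactly the ones used in the
rearrangement of 12! above.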

The converse to Wilson’s theorem is also true (see Problem 6.7):

If (n - 1)! ≡ -1 (mod n) for an integer n > 1, then n is prime.

This, at first glance, looks quite promising as a test for determining 
whether a number is prime, but in reality it is virtually useless because 
the factorial function grows so quickly. Even to test an extremely small 
number such as n = 101 involves a huge number 100! with 158 digits. 
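
To make this concrete, here is a small Python sketch of Wilson's criterion
used as a primality test. The name is_prime_wilson is our own invention.
Note one assumption that goes beyond the text: the sketch reduces modulo n
after every multiplication, which avoids ever writing down the huge number
(n - 1)!, but it still needs roughly n multiplications, so the test remains
impractical for numbers of any real size.

```python
import math


def is_prime_wilson(n):
    """Return True when (n - 1)! ≡ -1 (mod n), for n > 1."""
    if n < 2:
        return False
    acc = 1
    for k in range(2, n):
        acc = acc * k % n          # keep only the remainder modulo n
    return acc == n - 1


print(len(str(math.factorial(100))))                  # 158 digits in 100!
print([n for n in range(2, 30) if is_prime_wilson(n)])
```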


Two Quadratic Congruences 

In this section we will analyze two simple-looking quadratic 
congruences 

x^2 - 1 ≡ 0 (mod p) and x^2 + 1 ≡ 0 (mod p),

where p is a prime. As we mentioned earlier, these are somewhat trickier
to solve than their equational counterparts

x^2 - 1 = 0 and x^2 + 1 = 0.


The first congruence, which we prefer to write as x^2 ≡ 1 (mod p),
offers no surprises at all. As we saw while proving Wilson’s theorem,
the only solutions are x = 1 and x = p - 1 (that is, x ≡ -1). So this
congruence has the same solutions, effectively, as the equation x^2 = 1.
However, when solving this congruence, we did need to exercise some
care, using Euclid’s lemma, to show that these were the only solutions
modulo p.

The second congruence, on the other hand, is much trickier and
surprisingly hard to deal with. Again, we usually prefer to write this
congruence as x^2 ≡ -1 (mod p), and so you might suspect it would
have no solutions at all, since the corresponding equation x^2 = -1 has
no real solutions.

That's the first surprise, since this congruence can have a solution.
In fact, we already saw one example of this on page 126 where we used
x = 34 as a solution for the congruence x^2 ≡ -1 (mod 89). We didn’t
mention this at the time, but it is clear that this congruence also has
another solution: x ≡ -34, that is, x = 55.

Still, how did we even know that the congruence x^2 ≡ -1 (mod 89)
had a solution, and how did we come up with the solution x = 34 in the
first place? This isn’t so easy, and that’s the second surprise. We are
going to discuss three methods for finding a solution to this congruence.

1. One method, of course, is simply to try all values of x in the 
complete residue system {0, 1, 2, ... , 88} to see which, if any, 
satisfy the congruence. Even for a small prime such as p = 89, 
this is a tedious approach, but it will get the job done. 

2. Another idea is to use Fermat’s little theorem. To illustrate this,
we are going to consider the progression 1, 3, 3^2, 3^3, ... (we’ll
explain why we choose the number 3 shortly). By Fermat’s little
theorem, we know that 3^88 ≡ 1 (mod 89). Therefore, 3^44 is a
solution to the congruence x^2 ≡ 1 (mod 89), so we know that 3^44
is either 1 or -1 modulo 89. If 3^44 ≡ -1 (mod 89), then we have
found our solution: 3^22. On the other hand, if 3^44 ≡ 1 (mod 89),
then we repeat this idea with 3^22. So one of the numbers 3^11 or 3^22
should be a solution.

We begin with 3^11, and discover that 3^11 ≡ 37 (mod 89). We go
on to 3^22, and compute 3^22 = (3^11)^2 ≡ 37^2 ≡ 34 (mod 89).

Finally, and this is apparently our last chance,

3^44 = (3^22)^2 ≡ 34^2 ≡ 88 ≡ -1 (mod 89).

Thus x = 3^22 is a solution. This is clearly a good, fairly quick, way
to find x = 34.

However, if we had started with the progression
1, 2, 2^2, 2^3, ..., instead, this idea wouldn’t have worked out so
well. Everything would be the same, and we would conclude that
one of the numbers 2^11, 2^22, 2^44 should be a solution. But then,
when we compute 2^11, we discover that 2^11 ≡ 1 (mod 89). This
stops us cold.

What was special about the number 3 in this case? The special
property that 3 has modulo 89 is that the exponent 88 was the
first time in the progression 1, 3, 3^2, 3^3, ..., that 3^k ≡ 1
(mod 89), for k > 0. In particular, this means that the 88 numbers
in the set {1, 3, 3^2, 3^3, ..., 3^87} are congruent, in some order, to
the numbers 1, 2, ..., 88.

Euler was the first person to suggest that for any prime p there
is such a number, and for a given prime p he called such a
number a primitive root. The notion of a primitive root is
extremely useful, as we have just seen, and we will discuss
primitive roots at length in the next chapter. In the meantime,
since we can’t yet know that all primes have a primitive root, we
will try a third idea to solve x^2 ≡ -1 (mod p) in
general.

3. We will use p = 13 to illustrate this idea, which is due to
Lagrange. We know by Wilson’s theorem that 12! ≡ -1
(mod 13), and so we write

-1 ≡ 12! = 1 · 2 · 3 · 4 · 5 · 6 · 7 · 8 · 9 · 10 · 11 · 12
   ≡ 1 · 2 · 3 · 4 · 5 · 6 · (-6) · (-5) · (-4) · (-3) · (-2) · (-1)
   = (6!)^2 (mod 13).

Thus, 6! is a solution to the congruence x^2 ≡ -1 (mod 13). This is
easily checked since 6! = 30 · 24 ≡ 4 · (-2) ≡ 5 (mod 13), and
5^2 = 25 ≡ -1 (mod 13). Then, of course, -6! is also a solution. Note
that we have produced two solutions, x = 5 and x = 8, but we haven’t
yet claimed that these are the only solutions.
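
The three methods above can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the function names are our
own, p is assumed to be an odd prime, and the second function simply
returns None when the chosen base gets stuck the way the base 2 did
modulo 89.

```python
def solve_by_search(p):
    """Method 1: try every x in the complete residue system 0..p-1."""
    return [x for x in range(p) if (x * x + 1) % p == 0]


def solve_by_halving(g, p):
    """Method 2: halve the exponent p - 1, starting from g^(p-1) ≡ 1."""
    e = p - 1
    while e % 2 == 0:
        e //= 2
        y = pow(g, e, p)           # y is congruent to 1 or -1 at this point
        if y == p - 1:
            # g^e ≡ -1, so g^(e/2) is a solution when e is even
            return pow(g, e // 2, p) if e % 2 == 0 else None
    return None                    # every power we tried was congruent to 1


def solve_by_factorial(p):
    """Method 3: Lagrange's idea, x = ((p - 1)/2)! modulo p."""
    x = 1
    for k in range(1, (p - 1) // 2 + 1):
        x = x * k % p
    return x


print(solve_by_search(89))         # [34, 55]
print(solve_by_halving(3, 89))     # 34
print(solve_by_halving(2, 89))     # None: the base 2 stops us cold
print(solve_by_factorial(13))      # 5
```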

What makes Lagrange’s idea work (besides Wilson’s theorem,
obviously) is the following:

First, p needed to be odd so that the first half of the numbers in
(p - 1)! could be positive and the second half of the numbers,
working down from p - 1, could be negative.

Second, and more important, there had to be an even number of
negative numbers to preserve the sign in the congruence we are
trying to solve.

So how do we guarantee that there will be an even number of minus
signs? Since there are (p - 1)/2 minus signs, we need (p - 1)/2 to be
even. Thus, we need (p - 1)/2 = 2n; that is, we need p to be of the form
4n + 1. Then, this idea will work. But the idea can't be applied when p
is of the form 4n + 3.

This may sound like another dead end; but, instead, it is a clue
because it turns out that we don’t need this idea to work for primes of
the form 4n + 3, since in that case there are no solutions to be found
anyway, as the next theorem tells us.

Theorem 6.4. The congruence x^2 ≡ -1 (mod p) has a solution for p = 2,
and has a solution for an odd prime p if and only if p is of the form 4n + 1.

Proof 

For p = 2, x = 1 is the only solution. 

For a prime p = 4n + 1, we use Theorem 6.3, Wilson’s theorem, to
write

-1 ≡ (p - 1)! ≡ 1 · 2 ··· (2n) · (-2n) ··· (-2) · (-1) (mod p)
   ≡ 1 · 2 ··· (2n) · (2n) ··· (2) · (1) (mod p)
   ≡ ((2n)!)^2 (mod p).

Thus x = (2n)! and x = -(2n)! are solutions to x^2 ≡ -1 (mod p).

In order to prove the converse, we will assume that a is a solution to
the congruence, and that p = 4n + 3. First, we observe that since a is a
solution to the congruence, a and p are relatively prime. Then we write

a^{p-1} = a^{4n+2} = (a^2)^{2n+1} ≡ (-1)^{2n+1} = -1 (mod p),

which is a contradiction, since Fermat’s little theorem tells us that
a^{p-1} ≡ 1 (mod p). Thus, if the congruence has a solution, the odd
prime p cannot be of the form 4n + 3. This completes the proof. ■
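
Theorem 6.4 is easy to check by brute force for small primes. The sketch
below is our own illustration (the helper names small_primes and
has_sqrt_of_minus_one are assumptions, not notation from the text); it
compares a direct search for solutions with the prediction "p = 2 or p is
of the form 4n + 1."

```python
def small_primes(limit):
    """All primes below limit, found by trial division."""
    return [n for n in range(2, limit)
            if all(n % d for d in range(2, int(n ** 0.5) + 1))]


def has_sqrt_of_minus_one(p):
    """True when x^2 ≡ -1 (mod p) has at least one solution."""
    return any((x * x + 1) % p == 0 for x in range(p))


for p in small_primes(60):
    predicted = (p == 2) or (p % 4 == 1)
    print(p, has_sqrt_of_minus_one(p), predicted)
```

In every row the last two columns should agree.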


Lagrange's Theorem 

We are quite familiar with the fact that a polynomial equation 

a_n x^n + a_{n-1} x^{n-1} + ··· + a_1 x + a_0 = 0

of degree n can have at most n solutions, because, for example, we
know that a cubic polynomial such as x^3 - 2x^2 - x + 2 can cross the
x-axis at most three times. In fact, a cubic polynomial might have only 
a single solution if the curve crosses the x-axis just once, or possibly 
two solutions if the axis is crossed once and is tangent to the curve at 
another point. 

Therefore, it would come as a tremendous shock to us if a congruence 
such as 


x^2 + 1 ≡ 0 (mod 89)

were to have more than the two solutions x = 34 and x = 55 that we
have already found.

Similarly, in the proof of Wilson’s theorem, the one thing we had to
be careful about was to check that for a prime p the only solutions for
the congruence

x^2 - 1 ≡ 0 (mod p)

were the obvious ones, x = 1 and x = -1. You might have wondered at
the time why we even bothered to check, since it seemed obvious that a
quadratic congruence couldn’t have more than two solutions.

Ah, but it can! It all depends on what the modulus is. For example, if 
the modulus is 8, here is a quadratic congruence that has four solutions: 

x^2 - 1 ≡ 0 (mod 8).

Each of the four numbers x = 1, 3, 5, 7 is a solution, and none of the 
numbers x = 0, 2, 4, 6 are solutions. 

So, clearly, some congruences of degree n do have more than n 
solutions, and this is a significant difference between congruences and 
polynomial equations. 
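
A short Python sketch makes the contrast between prime and composite moduli
easy to see. The function roots_mod is a hypothetical helper of our own; it
takes the coefficients of a polynomial from the highest power down to the
constant term and simply tries every residue.

```python
def roots_mod(coeffs, m):
    """Residues x in 0..m-1 with f(x) ≡ 0 (mod m).

    coeffs lists the coefficients of f from the highest power down.
    """
    def f(x):
        value = 0
        for c in coeffs:
            value = (value * x + c) % m   # Horner's rule, reduced modulo m
        return value

    return [x for x in range(m) if f(x) == 0]


print(roots_mod([1, 0, -1], 8))    # [1, 3, 5, 7]: four solutions of x^2 - 1
print(roots_mod([1, 0, -1], 13))   # [1, 12]: only two for the prime 13
```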

Fortunately, in 1768, Joseph Louis Lagrange was able to prove that 
for a prime modulus p, no polynomial congruence of degree n can have 
more than n solutions. In other words, Lagrange proved that, at least 
when the modulus is prime, congruences behave the way we expect 
them to behave. In fact, it was precisely this behavior that we had in 
mind in our reconstruction of Fermat’s proof of Theorem 5.1, where 
we found a contradiction when a congruence, x^{2n} - 1 ≡ 0 (mod p),
of degree 2n was seen to have 4n solutions. Fermat knew this was
impossible, and as soon as we prove Lagrange’s theorem, we will also 
know it to be impossible. The proof we give of Lagrange’s theorem is due 
to Euler, to whom Lagrange communicated this very important result in 
about 1770 or 1771. 

Before we give that proof, however, we need to review some basic
facts about polynomials. One useful fact about polynomials is that if
a number a is a zero of a polynomial f (that is, if a is a solution for
the equation f(x) = 0), then x - a is a factor of the polynomial. (The
converse is obviously true, since if f(x) = (x - a)g(x), then clearly
f(a) = 0.) For example, for the polynomial f(x) = x^3 - x^2 - x - 2, we
can easily see that f(2) = 8 - 4 - 2 - 2 = 0, and so we know immediately
that x - 2 must be a factor of x^3 - x^2 - x - 2.

The reason this is true has to do with what is called the division
algorithm for polynomials. Just as you can for integers, you can always
divide one polynomial into another polynomial and get a quotient and
a remainder, and the same process of long division even works. So, for
example, to divide x - 2 into x^3 - x^2 - x - 2 you begin by dividing x
into x^3 to get a quotient x^2, and then multiply x^2 by x - 2 to get
x^3 - 2x^2, which is subtracted from x^3 - x^2 - x - 2, yielding a
“smaller” polynomial remainder x^2 - x - 2. At this point, you again
divide x into x^2 to get a quotient x, and multiply x by x - 2 to get
x^2 - 2x, which is subtracted from the remainder x^2 - x - 2 to get
another remainder x - 2. Again, you divide x into x to get a quotient
of 1, and multiply 1 by x - 2 to get x - 2, which is subtracted from the
remainder x - 2 to get a remainder 0. Thus, in this case, the quotient is
x^2 + x + 1 and the remainder is 0, so we have

x^3 - x^2 - x - 2 = (x - 2)(x^2 + x + 1).

Let’s do one more example, this time dividing x - 5 into x^2 - 1. We
divide x into x^2 so the first quotient is x, and this we multiply by x - 5 to
get x^2 - 5x, which we subtract from x^2 - 1 to get a remainder 5x - 1. At
this point we divide x into 5x to get a quotient 5, and multiply by x - 5
to get 5x - 25, which we subtract from 5x - 1 to get a remainder 24. Since
at this point we can no longer divide x into this remainder we are forced
to stop. Thus, in this case, the quotient is x + 5 and the remainder is 24,
and so we have

x^2 - 1 = (x - 5)(x + 5) + 24.

What is clear about this long division process is that you can just
keep repeating the division until the point where the remainder is a
polynomial with a smaller degree than the polynomial you are dividing
by. This fundamental property about polynomials is known as the
division algorithm for polynomials:

Given any two polynomials f and g with g ≠ 0, there exist
unique polynomials q and r such that

f = qg + r with deg(r) < deg(g).
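
The long division process described above can be written out as a short
Python sketch. This is our own illustration, not a standard library
routine: the name poly_divmod and the convention of listing coefficients
from the highest power down to the constant term are assumptions, and
rational coefficients are handled with the standard fractions module.

```python
from fractions import Fraction


def poly_divmod(f, g):
    """Divide f by g and return (quotient, remainder) as coefficient lists.

    Coefficients run from the highest power down to the constant term.
    """
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    if not g or g[0] == 0:
        raise ValueError("the leading coefficient of g must be nonzero")
    quotient = []
    remainder = f[:]
    while len(remainder) >= len(g):
        factor = remainder[0] / g[0]       # divide the leading terms
        quotient.append(factor)
        for i in range(len(g)):            # subtract factor * g, aligned left
            remainder[i] -= factor * g[i]
        remainder.pop(0)                   # the leading term has cancelled
    return quotient, remainder


# x^3 - x^2 - x - 2 divided by x - 2: quotient x^2 + x + 1, remainder 0
print(poly_divmod([1, -1, -1, -2], [1, -2]))
# x^2 - 1 divided by x - 5: quotient x + 5, remainder 24
print(poly_divmod([1, 0, -1], [1, -5]))
```

The remainder list always has one entry for each power below the degree of
g, so leading zeros in it are expected.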

Note that even though in our examples all the coefficients were 
integers, it is clear that the coefficients of q and r might need to be 
rational numbers in some cases even when the coefficients of f and 
g are integers. For this reason we usually speak of polynomials over 
the rationals, or polynomials over the reals, as a way of making clear 
exactly what coefficients are allowed. Number systems such as the 
rational numbers, the real numbers, and the complex numbers, in 
which you can always divide by any nonzero number, are called fields.
We will soon see other examples of fields. As long as the coefficients
for your polynomials are coming from a field, the division algorithm 
holds. 

Now, we can use the division algorithm to prove that if a is a zero of
a polynomial f, then x - a is a factor of the polynomial. We simply
divide x - a into f, that is, the division algorithm tells us there are
polynomials q and r such that f = q(x - a) + r, with deg(r) < deg(x - a).
But deg(x - a) = 1, so r must be a constant, and since a is a zero of f, we
have 0 = f(a) = q(a) · 0 + r = r, and so f = q(x - a), as claimed.

Can we expect this same property to hold for polynomial congruences?
For example, we have just seen that the polynomial congruence
x^2 - 1 ≡ 0 (mod 8) has four solutions, x = 1, 3, 5, 7, and it seems highly
unlikely that each of x - 1, x - 3, x - 5, and x - 7 could factor x^2 - 1.

But they do! For instance, we saw above that

x^2 - 1 = (x - 5)(x + 5) + 24,

and so, in this case, the remainder 24 is congruent to 0 modulo 8.
Therefore,

x^2 - 1 ≡ (x - 5)(x + 5) (mod 8).

In other words, since 5 is a zero of the polynomial congruence, x - 5 is
a factor of the polynomial.

The proof of this basic fact about polynomial congruences is almost
the same as it was for polynomials. Suppose that a is a zero of a
polynomial congruence f modulo m; then, we again simply divide x - a
into f, and the division algorithm tells us there are polynomials q and r
such that f = q(x - a) + r, with r a constant. (Because the leading
coefficient of x - a is 1, the coefficients of q and r are integers
whenever those of f are.) So f(a) = q(a) · 0 + r = r; but, since a is a
zero of f modulo m, we also have f(a) ≡ 0 (mod m), and so
r ≡ 0 (mod m). Thus f ≡ q(x - a) (mod m), and x - a is a factor of f, as
claimed.
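
Dividing by x - a is simple enough to do by what is usually called
synthetic division. The following Python sketch (the name divide_by_linear
is our own) carries it out over the integers and then looks at the
remainder modulo 8 for each of the four zeros of x^2 - 1.

```python
def divide_by_linear(coeffs, a):
    """Divide f by (x - a) using integer arithmetic only.

    coeffs lists the coefficients of f from the highest power down.
    Returns (quotient coefficients, remainder), where remainder = f(a).
    """
    values = []
    carry = 0
    for c in coeffs:
        carry = carry * a + c
        values.append(carry)
    remainder = values.pop()       # the final value is f(a)
    return values, remainder


for a in (1, 3, 5, 7):
    q, r = divide_by_linear([1, 0, -1], a)
    # q = [1, a] stands for x + a; r is divisible by 8 in every case
    print(a, q, r, r % 8)
```

For a = 5 this reproduces the quotient x + 5 and the remainder 24 found
above.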

Theorem 6.5 (Lagrange's theorem). For a prime p, a polynomial congruence
of degree n

a_n x^n + a_{n-1} x^{n-1} + ··· + a_1 x + a_0 ≡ 0 (mod p)

has at most n solutions.

Proof 

Let f be a polynomial of degree n, and suppose a_1 is a solution to the
congruence f(x) ≡ 0 (mod p). Then x - a_1 is a factor of f modulo p,
and we can write f ≡ (x - a_1)g (mod p), where g is a polynomial of
degree n - 1.

If a_2 is another solution to the congruence f(x) ≡ 0 (mod p), we
have 0 ≡ f(a_2) ≡ (a_2 - a_1)g(a_2) (mod p). But a_2 and a_1 are not congruent
modulo p, so we can divide by a_2 - a_1 and conclude by Theorem 3.7 that
g(a_2) ≡ 0 (mod p). Therefore, x - a_2 is a factor of g modulo p, and we
can write g ≡ (x - a_2)h (mod p), where h is a polynomial of degree n - 2.

Thus we now have

f ≡ (x - a_1)(x - a_2)h (mod p),

where h is a polynomial of degree n - 2, and any additional solutions
must satisfy h(x) ≡ 0 (mod p). We can continue in this way for as long
as there are additional solutions, and each additional solution provides
another factor x - a_i. Since this process can continue for at most n
steps, because the degree drops by 1 at each step, there can be at most
n solutions for the congruence. This completes the proof. ■
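
Lagrange's theorem can be checked exhaustively for small cases. The sketch
below is only an experiment under our own assumptions (the names
count_roots and max_roots are hypothetical): it runs through every
polynomial of a given degree whose leading coefficient is not divisible by
the prime p and records the largest number of solutions it finds.

```python
from itertools import product


def count_roots(coeffs, p):
    """Number of residues x in 0..p-1 with f(x) ≡ 0 (mod p)."""
    def f(x):
        value = 0
        for c in coeffs:
            value = (value * x + c) % p
        return value

    return sum(1 for x in range(p) if f(x) == 0)


def max_roots(degree, p):
    """Largest root count over all polynomials of this exact degree mod p."""
    best = 0
    for lead in range(1, p):                       # leading coefficient
        for rest in product(range(p), repeat=degree):
            best = max(best, count_roots((lead,) + rest, p))
    return best


print(max_roots(2, 7))   # 2: no quadratic has more than two solutions mod 7
print(max_roots(3, 5))   # 3: no cubic has more than three solutions mod 5
```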


Problems 

6.1 * (H,S) The proof presented in this chapter of Fermat’s little theorem relied
on the useful fact that, for a number a relatively prime to a prime p, the
numbers a, 2a, 3a, ..., (p - 1)a represent, in some order, the p - 1
distinct nonzero remainders 1, 2, 3, ..., p - 1 modulo p. Illustrate
this using the number a = 5 and the prime 17.

6.2 * (S) Solve the linear congruence 16x ≡ 13 (mod 31) by using Euclid’s
algorithm to write 1 = r · 16 + s · 31.

Show how this congruence could also be solved rather easily by
inspection.

Why is the straightforward method of letting x run through all
values 0, 1, 2, ..., 30 not such a good idea for this congruence?

6.3 (H,S) (a) Show that solving the congruence ax ≡ b (mod m) is equivalent
to solving the linear Diophantine equation ax - my = b.

(b) Prove Theorem 6.1 by appealing directly to Theorem 4.1.

6.4 * (H) We used p = 13 to illustrate the proof of Wilson’s theorem by
grouping the numbers from 2 through 11 (that is, 2 through
p - 2) into inverse pairs in order to show Gauss’s beautiful idea that
(p - 1)! reduced to just the single number p - 1 modulo p, that is,
to -1. Do the same thing for the prime p = 19 by grouping the
numbers 2 through 17 into inverse pairs.

6.5 (H,S) In the Chinese remainder theorem, the moduli m_i are pairwise
relatively prime. This is a strong hypothesis on a set of numbers. For
example, the three numbers 6, 10, and 15 are not pairwise relatively
prime, even though their greatest common divisor is 1.

(a) Show that the following system of congruences does not have a
solution:

x ≡ 1 (mod 6); x ≡ 2 (mod 10); x ≡ 3 (mod 15).

(b) Show that the following system of congruences has a unique
solution modulo 30, where 30 is the least common multiple of
the three numbers 6, 10, and 15:

x ≡ 5 (mod 6); x ≡ 7 (mod 10); x ≡ 2 (mod 15).

(c) Make a conjecture as to a condition that could be placed on pairs
of moduli, m_i and m_j, and pairs of constants, a_i and a_j, so that a
solution will exist for such a system of linear congruences.

6.6 (H,S) The most famous of all remainder problems involves eggs. This 

problem is very old, and was known in several parts of the world, 
appearing for instance in India, Egypt, and in Fibonacci’s Liber abaci. 
There are two versions of this problem; solve both of them. 

The story is that one day a woman was on the way to market with a 
basket of eggs, when she met with an accident and all of the eggs were 
broken. The man who caused the accident wanted to pay for the eggs, 
but the woman did not know how many she had in the basket. 
However, she did remember noticing a strange thing: when she put the 
eggs in the basket two at a time there was one egg left; and yet when 
she took the eggs out of the basket three at a time there was also one 
egg left; and this was also true when she did the same thing with the 
eggs four, five, and six at a time, there was always one egg left over; but 
when she put them back in, seven at a time, there were no eggs left. 
What is the smallest number of eggs she could have had in the basket? 

In the second version, the story is the same, and she again 
remembers having one egg left over when she puts them in two at a 
time, but then she has two, three, four, five, and zero eggs remaining 
for each of the other times she “counts” the eggs. 

6.7 (H) Prove the converse of Wilson’s theorem:

If (n - 1)! ≡ -1 (mod n) for an integer n > 1, then n is prime.

6.8 (H,S) In order to get an appreciation for how hard John Wilson may have
worked to come up with his conjecture, verify Wilson’s theorem for
p = 19 by confirming directly by long division that 19 | 18! + 1.

See the hint section if your calculator will not compute 18! + 1, or
use Sage.

6.9 (H,S) Prove the following two variations on Wilson’s theorem, for any odd
prime p:

(a) 1^2 · 3^2 · 5^2 ··· (p - 2)^2 ≡ (-1)^{(p+1)/2} (mod p).

(b) 2^2 · 4^2 · 6^2 ··· (p - 1)^2 ≡ (-1)^{(p+1)/2} (mod p).

6.10 * Show that 2 is a primitive root for the prime 19. That is, show that the
progression 2, 2^2, ..., 2^18 represents all the nonzero remainders from
1 to 18 modulo 19; and that, in particular, 2^18 is the first term in the
progression that is congruent to 1 modulo 19.

6.11 (H,S) In Chapter 7, we will prove that every prime p has a primitive root;
that is, for each prime p there is a number a > 1 for which p - 1 is the
least positive integer k such that a^k ≡ 1 (mod p).

Assuming that every prime has such a primitive root, show that you
can use Fermat’s little theorem to prove Wilson’s theorem.

6.12 (H,S) Find two solutions for the congruence x^2 ≡ -1 (mod 29) using all
three methods discussed in the text. Do this in the same order in
which these methods were discussed, that is:

1. Try all values of x starting with 1, 2, ... until you find a solution.

2. Use Fermat’s little theorem, and the progression 1, 2, 2^2, 2^3, ...
(this will work, because 2 is a primitive root for the prime 29).

3. Use the solutions x = (2n)! and x = -(2n)! given in the proof of
Theorem 6.4.

See the hint section if your calculator will not compute (2n)! in this
case, or use Sage.

6.13 (H) Use a calculator to verify that the solution x = (2n)! found for the
congruence x^2 ≡ -1 (mod p) in the proof of Theorem 6.4 produces
the solution x = 34 when p = 89.

See the hint section if your calculator will not compute (2n)! in this
case, or use Sage.

6.14 (H,S) You may have noticed that when we illustrated the corollary to
Theorem 5.3 by writing the number 3185 as a sum of two squares as
56^2 + 7^2, one of its prime divisors, 7, not only divides 3185 twice (which
was the point of the corollary) but that 7 also divides each square
individually; that is, 7 | 56 and 7 | 7. This was no accident.

Suppose that p is a prime of the form p = 4n + 3 and that p divides a
sum of two squares; that is, p | a^2 + b^2. Prove that p | a and p | b.

6.15 (S) Theorem 2.1, which says that there are infinitely many prime
numbers, leads to an obvious question:

Are there an infinite number of primes of each of the two forms
4n + 1 and 4n + 3?

The answer to this question is yes.

(a) Imitate Euclid’s proof of Theorem 2.1 to show that there are an
infinite number of primes of the form 4n + 3 by assuming that
p_1, p_2, ..., p_k are all the primes of the form 4n + 3. Then let
N = 4 · p_1 · p_2 ··· p_k - 1, and reach a contradiction.

(b) Explain why this same approach doesn’t work so easily for
primes of the form 4n + 1. In other words, what goes wrong if we
assume that p_1, p_2, ..., p_k are all the primes of the form 4n + 1,
then let N = 4 · p_1 · p_2 ··· p_k + 1, and try to reach a contradiction?

(c) Show how Theorem 6.4 makes it possible to use Euclid’s elegant
idea after all for primes of the form 4n + 1, because you can
assume that p_1, p_2, ..., p_k are all the primes of the form 4n + 1,
and let N = (2 · p_1 · p_2 ··· p_k)^2 + 1 to reach a contradiction.

Recall that we were able to reach the same conclusion using a
slightly different method in Problem 5.17.

(d) Use the idea in part (a) to prove that there are an infinite number
of primes of the form 6n + 5; that is, the arithmetic
sequence

5, 11, 17, 23, 29, 35, 41, 47, ...

contains an infinite number of primes.

6.16 (H,S) The two cubic congruences

x(x - 1)(x - 2) ≡ 0 (mod 3) and x^3 - x ≡ 0 (mod 3)

have something interesting in common. They have exactly the same
solutions!

The solutions for the first congruence, x = 0, 1, 2, are obvious; these
same solutions for the second congruence follow from Fermat’s little
theorem.

(a) Show that this is no accident, since x(x - 1)(x - 2) ≡ x^3 - x
(mod 3).

(b) Repeat this for the prime p = 5 by showing that the two
polynomial congruences

x(x - 1)(x - 2)(x - 3)(x - 4) ≡ 0 (mod 5) and x^5 - x ≡ 0 (mod 5)

have the same solutions; and that

x(x - 1)(x - 2)(x - 3)(x - 4) ≡ x^5 - x (mod 5).

(c) The crucial fact that we used in the proof of Lagrange’s theorem
was that if a is a zero of a polynomial congruence f, then x - a is
a factor of f. Use this fact to prove that, in general, for a prime p,

x(x - 1)(x - 2) ··· (x - (p - 1)) ≡ x^p - x (mod p).

(d) Use part (c) to prove Wilson’s theorem.


7 

Euler and Lagrange 


A New Beginning 

Modern number theory burst forth in a single blaze of glory from the
mind and the work of Pierre de Fermat, but with his death in 1665 it
was to lie dormant, almost to disappear completely, as the mathematics
and the mathematicians of the latter part of the seventeenth century
became occupied with the recent discovery of calculus by Newton and
Leibniz. Even the way in which mathematics was practiced at this time 
was rapidly changing into something more closely resembling what we 
see today, where mathematicians hold positions at universities, publish 
their work in professional journals, and participate in the activities of 
scientific societies. 

For nearly a century, there was no one to pick up and carry forward 
the torch that had been lit by Fermat. Remarkably, though, the flame 
did not die out altogether, and by 1730 the time was again ripe for a 
new beginning for number theory. This time, however, it was not a 
jurist and part-time amateur mathematician who would take up the 
cause of this ancient, though sometimes forgotten and ignored, topic in 
mathematics, but one of the greatest and most prolific mathematicians 
of all time, who was about to discover the delights of number theory in 
the dying embers of Fermat’s work. 

In early December 1729, in St. Petersburg, where he held a chair at the 
Academy of Sciences, Leonhard Euler received a letter from Christian 
Goldbach in Moscow. In this letter, Goldbach first replied to some 
matters contained in an earlier letter written to him by Euler, and then 
he appended a single, unrelated question: 

Is Fermat’s observation known to you, that all numbers 2^{2^n} + 1 are
primes? He said he could not prove it; nor has anyone else done so 
to my knowledge. 

Euler didn’t rise to the bait immediately, but, by June 1730, he had
begun reading the works of Fermat, and he was hooked for life, especially
by Fermat’s claim that every number is a sum of four squares.

Figure 7.1 Leonhard Euler, 1707-83.

Euler’s collected works fill eighty-seven volumes, a prodigious output
that alone can almost fill a small library. His work ranged over all of the
mathematics and much of the science of his day. He saw connections
everywhere. In particular, he could see connections between an area
of mathematics such as analysis (he published, in 1748, Introductio in
analysin infinitorum, the first of his three great textbooks on calculus
that solidified and extended the methods of Newton and Leibniz) and
the vastly different area of number theory, and wrote

From this one can see how closely and wonderfully infinitesimal 
analysis is related not only to ordinary analysis, but even to the 
theory of numbers, however inconsistent this may seem to this 
higher calculus. 

This was a remarkably prescient observation. By the twentieth century,
the study of number theory would increasingly rely on techniques
from this “higher calculus,” the field of mathematics that today we call 
analysis. In fact, so much so, that the term elementary number theory is 
often used to refer to that part of classical number theory that is done 
without using any techniques from analysis — in other words, the way 
in which we are studying number theory in this book. 

Nicolas Condorcet, one of the leading mathematicians throughout 
the entire period of turmoil that was the French Revolution (he would 
die in prison in 1794) was called upon by the Paris Academy of Science 
to write Euler's eulogy in order to give an account of “Euler’s immense 
scientific labours.” He closed his eulogy with this unforgettable image 
of Euler’s final moments: 

On the 7th of September, 1783, after amusing himself with
calculating on a slate, the laws of the ascending motion of
air-balloons, the recent discovery of which was then making a
noise all over Europe, he dined with Mr. Lexell and his family, 
talked of Herschell’s planet, and of the calculations which 
determine its orbit. A little after he called his grand-child, and 
fell a playing with him as he drank tea, when suddenly, the pipe, 
which he held in his hand, dropped from it, and he ceased to 
calculate and to breathe. 

This passage is particularly moving to us in this 1795 translation
by Henry Hunter as a result of the poetic change he made in the last
words of the phrase “il cessa de calculer et de vivre” from “to live”
to “to breathe,” which now echoes another very famous quote that
captures the ease with which Euler calculated, this by the French
physicist François Arago:

He calculated without apparent effort, as men breathe, or as 
eagles sustain themselves in the wind. 

There is one final quote about Euler any mathematician would 
do well to heed even today, and that is the well-known quote by 
Pierre Simon de Laplace, a contemporary of Lagrange, Legendre, and 
Condorcet: 

Read Euler, he is our master in everything. 


Euler's Phi Function 

In 1760, Euler gave a generalization of Fermat’s little theorem that 
involves the positive integers that are less than and relatively prime to 
an integer n. For example, if n = 12, then 1, 5, 7, and 11 are relatively 
prime to 12; whereas, if n = 13, then 1, 2, 3, ... , 12 are relatively prime 
to 13. 

Euler used π(n) to denote the number of positive integers less than
and relatively prime to n, although we now follow Gauss in his
Disquisitiones Arithmeticae and use the notation φ(n), and we refer to
this as Euler’s phi function. Thus φ(12) = 4 and φ(13) = 12.

Euler discovered that, by using this function φ(n), he could generalize
Fermat’s little theorem to include composite numbers n. Fermat’s little
theorem requires the modulus p to be prime. But, with Euler’s
generalization, using φ(n), the modulus can be composite.

For example, we observe that 5 is relatively prime to 12 and that
5^{φ(12)} ≡ 1 (mod 12); that is, 5^4 ≡ 1 (mod 12), which we can easily
verify, since 5^4 = (5^2)^2 = (25)^2 ≡ 1^2 = 1 (mod 12).

What Euler discovered was that if a is relatively prime to n, then
a^{φ(n)} ≡ 1 (mod n). The reason this “generalizes” Fermat’s little theorem
is that if n is prime, then φ(n) = n - 1. For example, for n = 13,
we see that saying 5^{φ(13)} ≡ 1 (mod 13) is actually the statement that
5^12 ≡ 1 (mod 13), which, after all, is just what Fermat’s little theorem
says.

At the heart of Euler’s insight into this remarkable result is the 
following very familiar idea (see Problem 6.1). If we take the four positive 
numbers 


1, 5, 7, 11 

that are less than and relatively prime to 12, and multiply them by a 
number a = 5 that is also relatively prime to 12, then we get 

$$1 \cdot 5 = 5, \quad 5 \cdot 5 = 25 \equiv 1, \quad 7 \cdot 5 = 35 \equiv 11, \quad 11 \cdot 5 = 55 \equiv 7 \pmod{12}$$

and we can see that we have exactly the same four numbers 5, 1, 11, 
and 7 we started with, just in a different order. 
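
This reshuffling is easy to check by machine. The following Python sketch is only an illustration; the helper name `reduced_residues` is ours and does not come from the text. It lists the numbers less than and relatively prime to n, multiplies each by a, and reduces modulo n.

```python
from math import gcd

def reduced_residues(n):
    """Positive integers below n that are relatively prime to n."""
    return [r for r in range(1, n) if gcd(r, n) == 1]

n, a = 12, 5
residues = reduced_residues(n)               # [1, 5, 7, 11]
products = [(a * r) % n for r in residues]   # [5, 1, 11, 7]

print(residues, products)
print(sorted(products) == residues)          # True: same numbers, new order
```

Changing `n` and `a` (keeping a relatively prime to n) gives other instances of the same rearrangement.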

A set of $\phi(n)$ mutually incongruent integers $a_1, a_2, \ldots, a_{\phi(n)}$
that are relatively prime to n is called a reduced system of residues
modulo n. For example, {1, 5, 7, 11} is a reduced system of residues
modulo 12. The set
{5, 25, 35, 55} is also a reduced system of residues modulo 12. Note that 
for a prime such as n = 13, the set {1, 2, 3, ... , 12} is a reduced system 
of residues modulo 13. 

We now have terminology for two different systems of residues, 
which at times can be confusing. Just remember that, modulo n, a 
complete system of residues means you have a set with n numbers that 
represent all possible remainders from 0 to n - 1; whereas a reduced 
system of residues means you have a set with $\phi(n)$ numbers that
represent all remainders relatively prime to n. The main problem with this
terminology is that you always have to remember that we are reducing 
a complete set of remainders down to just those remainders relatively 
prime to n. 

The key observation to make about reduced residue systems is that
if $a_1, a_2, \ldots, a_{\phi(n)}$ is a reduced residue system, and a is
relatively prime to n, then $a a_1, a a_2, \ldots, a a_{\phi(n)}$ is also a
reduced residue system. This is why multiplying the reduced residue
system {1, 5, 7, 11} by 5 produced another reduced system modulo 12,
the set {5, 25, 35, 55}.

This basic fact allows us to prove Euler’s theorem, using the same 
proof we used in Chapter 6 to prove Fermat’s little theorem. 

Theorem 7.1 (Euler). If a and n are two relatively prime integers, then

$$a^{\phi(n)} \equiv 1 \pmod{n}.$$


Proof 

Let $a_1, a_2, \ldots, a_{\phi(n)}$ be a reduced residue system modulo n;
then, we claim that

$$a a_1, a a_2, \ldots, a a_{\phi(n)}$$

is also a reduced residue system modulo n.

First, we see that these numbers are distinct modulo n. We can argue
almost exactly as we did in the proof of Fermat’s little theorem in
Chapter 6 by supposing that two of these numbers are not distinct; that
is, for $a_i \ne a_j$, suppose that $a a_i \equiv a a_j \pmod{n}$. Then, by
Theorem 3.7, since a and n are relatively prime, we can divide by a, and
get $a_i \equiv a_j \pmod{n}$. But this is a contradiction, since $a_i$ and
$a_j$ are two distinct integers in a reduced residue system, so they
can’t be congruent.

Next, we see that the numbers $a a_1, a a_2, \ldots, a a_{\phi(n)}$ are
relatively prime to n, because a is relatively prime to n, and each $a_i$
is also relatively prime to n. Therefore, by Euclid’s lemma, Theorem 3.3,
each $a a_i$ is relatively prime to n.

Hence, as claimed, the set

$$a a_1, a a_2, \ldots, a a_{\phi(n)}$$

is a reduced residue system modulo n. In other words, these $\phi(n)$
numbers are congruent, in some order, to the numbers
$a_1, a_2, \ldots, a_{\phi(n)}$.

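So we can write

$$(a a_1)(a a_2) \cdots (a a_{\phi(n)}) \equiv a_1 a_2 \cdots a_{\phi(n)} \pmod{n},$$

and then divide this congruence by each of $a_1, a_2, \ldots, a_{\phi(n)}$
(each is relatively prime to n, so Theorem 3.7 applies) to get

$$a^{\phi(n)} \equiv 1 \pmod{n}.$$

This completes the proof. ■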
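
A numerical check is no substitute for the proof, but it can make the statement concrete. The sketch below assumes a brute-force `phi` (our own helper, counting by gcd) and tests Euler’s theorem for every modulus below 60; it also shows that the conclusion can fail when a and n share a factor.

```python
from math import gcd

def phi(n):
    """Count the integers 1 <= r <= n that are relatively prime to n."""
    return sum(1 for r in range(1, n + 1) if gcd(r, n) == 1)

def euler_holds(a, n):
    """True when a**phi(n) leaves remainder 1 on division by n."""
    return pow(a, phi(n), n) == 1

# Every pair (a, n) with 2 <= n < 60 and gcd(a, n) = 1 satisfies the theorem.
pairs = [(a, n) for n in range(2, 60) for a in range(1, n) if gcd(a, n) == 1]
print(all(euler_holds(a, n) for a, n in pairs))    # True

# The two examples from the text: modulus 12 (composite) and 13 (prime).
print(pow(5, phi(12), 12), pow(5, phi(13), 13))    # 1 1

# The hypothesis matters: 2 is not relatively prime to 12.
print(pow(2, phi(12), 12))                         # 4
```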

It turns out that Euler’s phi function has a very important property
that we refer to by saying that $\phi$ is multiplicative. The multiplicative
property for a function has enormous consequences, because whenever
we know a function is multiplicative, as soon as we know its value for
prime powers, then we know its value for all integers. A function such as
$\phi$ is said to be multiplicative if whenever a and b are two relatively
prime integers, then

$$\phi(ab) = \phi(a)\phi(b).$$

For example, if we want to evaluate $\phi(30)$, we could do so by actually
finding all the numbers less than 30 that are relatively prime to 30, or
because $\phi$ is multiplicative we can simply evaluate each of $\phi(6)$
and $\phi(5)$ separately, and then get

$$\phi(30) = \phi(6) \cdot \phi(5) = 2 \cdot 4 = 8.$$

Note that there are in fact eight numbers less than 30 that are relatively 
prime to 30: 1, 7, 11, 13, 17, 19, 23, 29. 

More generally, and this is the useful point, for an integer n whose
prime decomposition is $n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$, we can
evaluate $\phi(n)$ in terms of the prime powers:

$$\phi(n) = \phi(p_1^{a_1}) \cdot \phi(p_2^{a_2}) \cdots \phi(p_k^{a_k}).$$


For Euler’s phi function this is very good news, because we can
determine $\phi(p^a)$ for a prime p in the following way. There are
$p^a - 1$ positive numbers less than $p^a$. The only numbers among these
that are not relatively prime to $p^a$ are

$$p, 2p, 3p, 4p, 5p, \ldots, (p^{a-1} - 1)p,$$

and there are exactly $p^{a-1} - 1$ of these numbers. The rest of these
numbers are relatively prime to $p^a$. Thus,

$$\phi(p^a) = (p^a - 1) - (p^{a-1} - 1) = p^a - p^{a-1}.$$

This immediately gives us a formula for $\phi(n)$ in terms of the prime
decomposition of n:

First Form: $\phi(n) = (p_1^{a_1} - p_1^{a_1 - 1})(p_2^{a_2} - p_2^{a_2 - 1}) \cdots (p_k^{a_k} - p_k^{a_k - 1})$.

And, by factoring $p_i^{a_i}$ out of each term, this formula can be
rewritten in a more attractive, and sometimes more convenient, form:

Second Form: $\phi(n) = n \left(1 - \frac{1}{p_1}\right)\left(1 - \frac{1}{p_2}\right) \cdots \left(1 - \frac{1}{p_k}\right)$.

We can illustrate both forms of this formula for $\phi(n)$ by computing
$\phi(200)$. Since $200 = 2^3 \cdot 5^2$, we get, using the first form of
this formula,

$$\phi(200) = (2^3 - 2^2)(5^2 - 5^1) = 4 \cdot 20 = 80;$$

and, using the second form, we get

$$\phi(200) = 200\left(1 - \tfrac{1}{2}\right)\left(1 - \tfrac{1}{5}\right) = 200 \cdot \tfrac{1}{2} \cdot \tfrac{4}{5} = 80.$$
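
Both forms translate directly into code once n has been factored. The sketch below is a minimal illustration that assumes simple trial division is acceptable for the sizes involved; the function names are ours.

```python
def factorize(n):
    """Trial-division factorization, returned as {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                      # whatever is left is a single prime
        factors[n] = factors.get(n, 0) + 1
    return factors

def phi_first_form(n):
    """Product of (p^a - p^(a-1)) over the prime powers dividing n."""
    result = 1
    for p, a in factorize(n).items():
        result *= p**a - p**(a - 1)
    return result

def phi_second_form(n):
    """n times the product of (1 - 1/p), kept in exact integer arithmetic."""
    result = n
    for p in factorize(n):
        result = result // p * (p - 1)
    return result

print(factorize(200))                               # {2: 3, 5: 2}
print(phi_first_form(200), phi_second_form(200))    # 80 80
print(phi_first_form(30), phi_second_form(30))      # 8 8
```

The second form divides by p before multiplying by p - 1, so every intermediate value stays an integer.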

However, in order to arrive at this formula for $\phi$, in either form, we
do need to know first that $\phi$ is multiplicative. Gauss, not surprisingly,
used congruences to give a proof. Here is how his proof goes. Let a and
b be two relatively prime positive integers. Then, for each of the $\phi(a)$
positive numbers r less than and relatively prime to a, and each of the
$\phi(b)$ positive numbers s less than and relatively prime to b, we can
write down a system of congruences

$$x \equiv r \pmod{a} \quad \text{and} \quad x \equiv s \pmod{b}.$$

Since there are $\phi(a)\phi(b)$ pairs of numbers r and s, we can write down
a total of $\phi(a)\phi(b)$ such systems of congruences.

By the Chinese remainder theorem, Theorem 6.2, each system of
congruences has a unique solution modulo ab, so these systems have a
total of $\phi(a)\phi(b)$ solutions. The proof can now be completed by
showing that this set of solutions is a reduced residue system modulo ab.
This is because a reduced residue system modulo ab has $\phi(ab)$ elements,
and so it will then follow that $\phi(a)\phi(b) = \phi(ab)$, and, therefore,
$\phi$ is multiplicative.

The details of showing that the set of solutions of these systems 
of congruences is a reduced residue system modulo ab will be left for 
Problem 7.5. It still needs to be shown that each solution is distinct; that 
each is relatively prime to ab; and that, moreover, any positive number 
less than and relatively prime to ab would also be a solution to one of 
these systems of congruences. 
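
Before working through the details, it can help to watch Gauss’s pairing in one small case. The sketch below is an illustration only, not a proof: it takes a = 5 and b = 6, solves each system by brute force (the helper names are ours), and compares the solutions with the numbers less than and relatively prime to 30.

```python
from math import gcd

def reduced_residues(n):
    """Integers 1 <= r <= n that are relatively prime to n."""
    return [r for r in range(1, n + 1) if gcd(r, n) == 1]

def crt_solution(r, a, s, b):
    """Brute force: the x in 0..ab-1 with x = r (mod a) and x = s (mod b)."""
    return next(x for x in range(a * b) if x % a == r % a and x % b == s % b)

a, b = 5, 6
solutions = sorted(crt_solution(r, a, s, b)
                   for r in reduced_residues(a)
                   for s in reduced_residues(b))

print(solutions)                               # [1, 7, 11, 13, 17, 19, 23, 29]
print(solutions == reduced_residues(a * b))    # True
```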


Primitive Roots

In Chapter 6, we encountered the useful notion of a primitive root for a 
prime p when we used the fact that 3 is a primitive root of 89 as one of 
our methods for solving the quadratic congruence $x^2 \equiv -1 \pmod{89}$.

At the time, we didn’t bother to verify that the 88 numbers in the
set $\{3, 3^2, 3^3, \ldots, 3^{88}\}$ are congruent, in some order, to the
numbers 1, 2, ..., 88, but that is exactly what it means to be a primitive
root. In other words, a primitive root for a prime p generates all the
numbers

$$1, 2, \ldots, p - 1.$$

You saw another example of a primitive root in Problem 6.10 when
you verified that for the prime 19, the powers $2, 2^2, \ldots, 2^{18}$
produce, modulo 19,

2, 4, 8, 16, 13, 7, 14, 9, 18, 17, 15, 11, 3, 6, 12, 5, 10, 1;

that is, once again, a primitive root, 2, modulo a prime, p = 19,
generates all of the numbers from 1 to p - 1.

This is another situation where the terminology is not very helpful.
Although we will continue to use the term primitive root, you need
to think of a primitive root as a generator. For a prime p, a primitive
root generates all the numbers 1, 2, ..., p - 1. This is incredibly
powerful, and our main goal in this section will be to prove, as Euler
conjectured, that every prime has a primitive root!

How can we extend the notion of a primitive root of a prime p to
composite numbers? Well, when a primitive root a for a prime p is
generating the numbers from 1 to p - 1, as it runs through the powers
from a to $a^{p-1}$, the set of numbers that a is generating is nothing more
nor less than a reduced system of residues modulo p. That’s what the set
{1, 2, ..., p - 1} is, a reduced residue system.

We now give a more general definition of a primitive root, and say
that a positive integer a is a primitive root of a positive integer n if the
least positive integer k for which $a^k \equiv 1 \pmod{n}$ is $\phi(n)$.

Notice how clean this definition is. It doesn’t say anything about
generating reduced residue systems; but, if $3^{88}$ is the first power of 3
that is congruent to 1, then $\{3, 3^2, \ldots, 3^{88}\}$ has to represent the
reduced residue system {1, 2, ..., 88}, end of story! How do we know there
are no repetitions in $3, 3^2, \ldots, 3^{88}$? Suppose $3^k \equiv 3^j \pmod{89}$
for two terms in this progression where j < k. Then, by Theorem 3.7, we
can divide by $3^j$, and get $3^{k-j} \equiv 1 \pmod{89}$. But this is
impossible, since $3^{88}$ is the first power of 3 in the progression that is
congruent to 1. Therefore, the 88 numbers in the set
$\{3, 3^2, \ldots, 3^{88}\}$ are distinct modulo 89; hence they have to be
congruent, in some order, to the 88 numbers in the set {1, 2, ..., 88}.
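
The verification that was skipped in Chapter 6 takes only a few lines of Python. In this sketch, `order` is our own helper that finds the first power congruent to 1 by repeated multiplication; it assumes its argument is relatively prime to the modulus, since otherwise no such power exists and the loop would never end.

```python
def order(a, n):
    """Least k >= 1 with a**k = 1 (mod n); assumes n >= 2 and gcd(a, n) == 1."""
    k, power = 1, a % n
    while power != 1:
        power = (power * a) % n
        k += 1
    return k

powers = [pow(3, k, 89) for k in range(1, 89)]   # 3, 3^2, ..., 3^88 modulo 89

print(order(3, 89))                              # 88
print(sorted(powers) == list(range(1, 89)))      # True
```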

Now, exactly the same idea works for a positive integer n that may be
composite. We know by Theorem 7.1 that if a and n are relatively prime,
then $a^{\phi(n)} \equiv 1 \pmod{n}$. But if a is a primitive root of n, then
$a^{\phi(n)}$ is the first power of a that is congruent to 1; so the set

$$\{a, a^2, \ldots, a^{\phi(n)}\}$$

has to represent a reduced residue system modulo n. Again, suppose
$a^k \equiv a^j \pmod{n}$ for two terms in this progression where j < k.
Then, by Theorem 3.7, we can divide by $a^j$, and get
$a^{k-j} \equiv 1 \pmod{n}$. But this is impossible, since $a^{\phi(n)}$ is the
first power of a in the progression that is congruent to 1. Therefore, the
$\phi(n)$ numbers in the set $\{a, a^2, \ldots, a^{\phi(n)}\}$ are distinct
modulo n. Hence, since these numbers are also relatively prime to n, this
set is a reduced residue system modulo n.

Once again, we see that, even for composite numbers, we can think
of a primitive root as a generator, in the sense that a primitive root a
generates a reduced residue system. Let’s look at an example: n = 9. By
the formula for $\phi(n)$, we see that $\phi(9) = \phi(3^2) = 3^2 - 3^1 = 6$.
Can we find a value for a such that the set $\{a, a^2, \ldots, a^6\}$ is a
reduced residue system? Yes, a = 2 will work, since $2^6$ is the first power
of 2 that is congruent to 1 modulo 9; that is, 2 is a primitive root of 9.
If we reduce the set of six numbers $2, 2^2, 2^3, 2^4, 2^5, 2^6$ modulo 9,
we get

2, 4, 8, 7, 5, 1.

These are exactly the six numbers less than and relatively prime
to 9. Thus the primitive root 2 generates a reduced residue system
{1, 2, 4, 5, 7, 8}.

Not all composite numbers have primitive roots, however. For example,
12 has no primitive root. By the second form of the formula for $\phi(n)$,
we see that
$\phi(12) = \phi(2^2 \cdot 3) = 12\left(1 - \tfrac{1}{2}\right)\left(1 - \tfrac{1}{3}\right) = 12 \cdot \tfrac{1}{2} \cdot \tfrac{2}{3} = 4$.

Any number a relatively prime to 12, then, has to be congruent to one
of the four numbers less than and relatively prime to 12: 1, 5, 7, 11. So
these four numbers are the candidates for a primitive root. Obviously,
1 is not a primitive root. Let’s try a = 5. Is $5^4$ the first power of 5 that
is congruent to 1? No, it isn’t: since $5^2 = 25 \equiv 1 \pmod{12}$, 5 isn’t a
primitive root. Similarly, we get $7^2 = 49 \equiv 1 \pmod{12}$ and
$11^2 = 121 \equiv 1 \pmod{12}$, and so we conclude that 12 has no primitive
root.
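
The same search can be automated for any modulus. The sketch below is a brute-force illustration using our own helper names; it tests each candidate a by comparing the first power of a congruent to 1 with $\phi(n)$.

```python
from math import gcd

def phi(n):
    """Count the integers 1 <= r <= n that are relatively prime to n."""
    return sum(1 for r in range(1, n + 1) if gcd(r, n) == 1)

def first_power_equal_to_one(a, n):
    """Least k >= 1 with a**k = 1 (mod n); assumes n >= 2 and gcd(a, n) == 1."""
    k, power = 1, a % n
    while power != 1:
        power = (power * a) % n
        k += 1
    return k

def primitive_roots(n):
    """All primitive roots of n among 1, 2, ..., n - 1."""
    target = phi(n)
    return [a for a in range(1, n)
            if gcd(a, n) == 1 and first_power_equal_to_one(a, n) == target]

print(primitive_roots(9))     # [2, 5]
print(primitive_roots(12))    # []
print(primitive_roots(7))     # [3, 5]
```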

Euler conjectured that any prime p has a primitive root, but he was 
never able to provide a proof. It was Legendre who supplied a proof by 
making clever use of Lagrange’s theorem, Theorem 6.5. Before we can
look at Legendre’s proof, it will be helpful to have one additional bit of
terminology.

For a positive integer a that is relatively prime to a positive integer
n, we say that the order of a modulo n is the least positive integer
k such that $a^k \equiv 1 \pmod{n}$. So, for example, we just saw that the
order of 2 modulo 9 is 6, and that the order of 5 modulo 12 is 2. The
definition of primitive root can be restated more concisely using this
new terminology: a is a primitive root of n if the order of a modulo n is
$\phi(n)$.

The key to Legendre’s proof is a basic fact about products of num- 
bers and their orders. Let’s look at a simple example that shows how 
Legendre’s proof works. 

Let p = 7, and let’s see how to find a primitive root for p. We need
to find a number a whose order is p - 1, that is, whose order is 6. But
$6 = 2 \cdot 3$, so what we do is find a number b whose order is 2, and
a number c whose order is 3; then, the number a = bc will have
order 6. In this case, we can let b = 6, since $6^2 = 36 \equiv 1 \pmod{7}$, so
b has order 2. And we can let c = 2, since $2^3 = 8 \equiv 1 \pmod{7}$, so c has
order 3. Then $a = bc = 12 \equiv 5 \pmod{7}$, and we claim that 5 has order
6. This is easy to check, since, modulo 7, we get

$$5^2 = 25 \equiv 4, \quad 5^3 \equiv 5(4) = 20 \equiv 6, \quad 5^4 \equiv 5(6) = 30 \equiv 2,$$

$$5^5 \equiv 5(2) = 10 \equiv 3, \quad 5^6 \equiv 5(3) = 15 \equiv 1 \pmod{7};$$

and so 5 is a primitive root of 7.
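
The example can be replayed in code, along with a wider spot check of the fact it relies on. This is a sketch with our own helper `order`; the second part checks, for the prime 13 only, that orders multiply whenever they are relatively prime.

```python
from math import gcd

def order(a, n):
    """Least k >= 1 with a**k = 1 (mod n); assumes n >= 2 and gcd(a, n) == 1."""
    k, power = 1, a % n
    while power != 1:
        power = (power * a) % n
        k += 1
    return k

p = 7
b, c = 6, 2
a = (b * c) % p
print(order(b, p), order(c, p), a, order(a, p))    # 2 3 5 6

# Spot check modulo 13: if the orders of u and v are relatively prime,
# then the order of uv is the product of the two orders.
q = 13
ok = all(order(u * v % q, q) == order(u, q) * order(v, q)
         for u in range(1, q) for v in range(1, q)
         if gcd(order(u, q), order(v, q)) == 1)
print(ok)                                          # True
```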

The main idea of Legendre’s proof, then, is to write p - 1 in its
canonical prime decomposition

$$p - 1 = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$$

and find numbers $n_1, n_2, \ldots, n_k$ whose orders are
$p_1^{a_1}, p_2^{a_2}, \ldots, p_k^{a_k}$, respectively. Then the order of
$a = n_1 n_2 \cdots n_k$ should be p - 1; that is, a is a primitive root of p.
Two things must be done to support Legendre’s idea and this claim.

The first, which is fairly straightforward and is left as an exercise in 
Problem 7.13, is to show that if two numbers b and c have orders r and s, 
respectively, and if r and s are relatively prime, then the product a = bc
has order rs. It is this multiplicative property of order that allows us to 
conclude above that a has order p - 1. 

The second thing that must be done to support Legendre’s idea is to
show that for each prime $p_i$ you can find an integer $n_i$ that has order
$p_i^{a_i}$. This is where Lagrange’s theorem comes in.

Theorem 7.2. Every prime has a primitive root. In fact, if p is a prime, then
p has exactly $\phi(p - 1)$ primitive roots.

Proof 

We need to show that for a given term $p_i^{a_i}$ in the prime decomposition
of p - 1 we can find an integer $n_i$ that has order $p_i^{a_i}$. Such a
number $n_i$ would be a solution to the congruence

$$x^{p_i^{a_i}} \equiv 1 \pmod{p}.$$

But, of course, this congruence also has solutions whose orders are
less than $p_i^{a_i}$; these orders would be $1, p_i, p_i^2, \ldots,$ up to and
including $p_i^{a_i - 1}$. Each of these other solutions, then, would also be
a solution to the congruence

$$x^{p_i^{a_i - 1}} \equiv 1 \pmod{p}.$$

Thus any solution to the first congruence that is not a solution to this
second congruence has order $p_i^{a_i}$, and so can serve as $n_i$. So, to
conclude the proof that p has a primitive root, all we need to know is
that the first congruence has more solutions than the second congruence.
In fact, as we will soon see, the first congruence has exactly $p_i^{a_i}$
solutions, and the second congruence has exactly $p_i^{a_i - 1}$ solutions.
Hence, there are exactly $p_i^{a_i} - p_i^{a_i - 1}$ numbers $n_i$ that have
order $p_i^{a_i}$. So, by the first form of the formula for $\phi(n)$ on
page 193, we see that this method appears to be producing exactly the
$\phi(p - 1)$ primitive roots for p that we want.

And it is! However, since there might be primitive roots that are not
produced in this manner, or even some duplicates, it is easier for us at
this point to confirm the actual number of primitive roots some other
way. The idea is that once we know there is at least one primitive root
a, we can list the numbers 1, 2, ..., p - 1 as powers of this primitive
root in some order: $a, a^2, \ldots, a^{p-1}$. A number $a^r$ in this list will
itself be a primitive root of p if and only if r is relatively prime to p - 1.
Thus p has exactly $\phi(p - 1)$ primitive roots (see Problem 7.11). We still
need to show that there is at least one primitive root.
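
The counting claim can be seen directly for p = 19, where we already know that 2 is a primitive root. The sketch below (our own helper names, brute force throughout) lists the primitive roots in two ways: as the powers $2^r$ with r relatively prime to 18, and by checking the order of every candidate.

```python
from math import gcd

def order(a, n):
    """Least k >= 1 with a**k = 1 (mod n); assumes n >= 2 and gcd(a, n) == 1."""
    k, power = 1, a % n
    while power != 1:
        power = (power * a) % n
        k += 1
    return k

p, g = 19, 2
by_exponent = sorted(pow(g, r, p) for r in range(1, p) if gcd(r, p - 1) == 1)
by_order = [a for a in range(1, p) if order(a, p) == p - 1]

print(by_exponent)                # [2, 3, 10, 13, 14, 15]
print(by_exponent == by_order)    # True
print(len(by_order))              # 6, which is phi(18)
```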

The reason we know exactly how many solutions each of the two
congruences given above has is that in each case the exponent of x
divides p - 1. In other words, it is true in general that if $d \mid (p - 1)$,
then the congruence

$$x^d \equiv 1 \pmod{p}$$

has exactly d solutions. The proof of this fact, which depends heavily on 
Lagrange's theorem, will be left for Problem 7.14, but we will illustrate 
the idea with an example. 

Let p = 13 and d = 3, so that $d \mid 12$. We wish to prove that the
congruence $x^3 \equiv 1 \pmod{13}$ has exactly 3 solutions. First, we write

$$x^{12} - 1 = (x^3 - 1)(x^9 + x^6 + x^3 + 1).$$

Now, we know that $x^{12} - 1 \equiv 0 \pmod{13}$ has twelve solutions by
Fermat’s little theorem. Since 13 is prime, each of these twelve solutions
must make one of the two factors congruent to 0. But, by Lagrange’s
theorem, Theorem 6.5, the congruence
$x^9 + x^6 + x^3 + 1 \equiv 0 \pmod{13}$ can have at most nine solutions.
Therefore, the congruence $x^3 - 1 \equiv 0 \pmod{13}$ has at least three
solutions, but since, again by Lagrange’s theorem, it can have no more
than three solutions, it must have exactly three solutions. Thus,
$x^3 \equiv 1 \pmod{13}$ has exactly three solutions. This completes the
proof. ■
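
The solution counts used in the proof can be tabulated for p = 13 by brute force. The sketch below is only a check of this one prime; `solutions` is our own helper.

```python
def solutions(d, p):
    """All x in 1..p-1 with x**d = 1 (mod p)."""
    return [x for x in range(1, p) if pow(x, d, p) == 1]

p = 13
for d in (1, 2, 3, 4, 6, 12):          # the divisors of p - 1 = 12
    print(d, solutions(d, p))

print(solutions(3, p))                 # [1, 3, 9]
```

For each divisor d the list printed has exactly d entries, as the general fact predicts.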


Euler's Identity 

Although Euler ultimately failed in his many efforts to prove Fermat’s 
claim that every number is the sum of four squares — the first proof was 
given by Lagrange — he did discover a truly remarkable identity for four 
squares. We will refer to this identity as the fundamental identity of Euler: 

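$$\begin{aligned}
(a^2 + b^2 + c^2 + d^2)&(\alpha^2 + \beta^2 + \gamma^2 + \delta^2) \\
&= (a\alpha + b\beta + c\gamma + d\delta)^2 + (a\beta - b\alpha - c\delta + d\gamma)^2 \\
&\quad + (a\gamma + b\delta - c\alpha - d\beta)^2 + (a\delta - b\gamma + c\beta - d\alpha)^2.
\end{aligned}$$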

Euler’s identity for four squares is the analogue of the fundamental 
identity of Fibonacci from Liber quadratorum for the sum of two squares 
given on page 118. 

The reason this identity is so extremely important is that it reduces
the question of writing numbers as sums of four squares to one of
showing that primes can be written as sums of four squares. For example,
since $7 = 2^2 + 1^2 + 1^2 + 1^2$ and $19 = 4^2 + 1^2 + 1^2 + 1^2$, we can use
this identity to write the composite number 133 as

$$\begin{aligned}
133 = 7 \cdot 19 &= (2^2 + 1^2 + 1^2 + 1^2)(4^2 + 1^2 + 1^2 + 1^2) \\
&= (8 + 1 + 1 + 1)^2 + (2 - 4 - 1 + 1)^2 \\
&\quad + (2 + 1 - 4 - 1)^2 + (2 - 1 + 1 - 4)^2 \\
&= 11^2 + 2^2 + 2^2 + 2^2,
\end{aligned}$$

that is, as a sum of four squares.
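
The identity is mechanical enough to hand to a computer. In the sketch below, `euler_four_square` (our name) returns the four terms on the right-hand side in the sign arrangement used above; the final lines spot-check the identity on all 4-tuples with entries in {-1, 0, 1}, which illustrates it but does not prove it.

```python
from itertools import product

def euler_four_square(x, y):
    """Four terms whose squares sum to (sum of squares of x)(sum of squares of y)."""
    a, b, c, d = x
    al, be, ga, de = y                 # alpha, beta, gamma, delta
    return (a * al + b * be + c * ga + d * de,
            a * be - b * al - c * de + d * ga,
            a * ga + b * de - c * al - d * be,
            a * de - b * ga + c * be - d * al)

def sum_of_squares(values):
    return sum(v * v for v in values)

terms = euler_four_square((2, 1, 1, 1), (4, 1, 1, 1))
print(terms, sum_of_squares(terms))    # (11, -2, -2, -2) 133

# Spot check on every pair of 4-tuples with entries in {-1, 0, 1}.
small = list(product((-1, 0, 1), repeat=4))
print(all(sum_of_squares(euler_four_square(x, y))
          == sum_of_squares(x) * sum_of_squares(y)
          for x in small for y in small))          # True
```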

In this same way, any composite number can be written as a sum of
four squares, provided the primes in its prime decomposition can be
written as a sum of four squares. Moreover, by Theorem 5.1, since 2 and
primes of the form 4n + 1 can be written as a sum of two squares, it is
only primes of the form 4n + 3 that we need to be concerned about.

With this astonishing identity, Euler had gotten very close to being 
able to prove that any number is the sum of four squares. In the next 
section we will see that he provided one more important link that would 
make Lagrange’s proof possible. 


Quadratic Residues 

Our goal — as it was for much of Euler’s life — is a proof of Lagrange’s 
famous theorem that every number can be written as the sum of four 
squares. As is often the case, in order to move toward this goal, we need 
to introduce a new concept. The notion of quadratic residues is a topic we 
will discuss in great detail in the next chapter, but in this section we will 
see how Euler made use of basic properties of quadratic residues in his 
unrelenting attempt to prove the four squares theorem. 

For a given prime p, a number a that is relatively prime to p is called a 
quadratic residue of p if it is congruent to a square modulo p; otherwise, 
a is called a quadratic nonresidue. For example, 6 is a quadratic residue 
of the prime 19, because $5^2 = 25 \equiv 6 \pmod{19}$; that is, 6 is a “square”
modulo 19. 

For any odd prime p, only half of the numbers 1, 2, ..., p - 1 will
be quadratic residues, and the other half will be quadratic nonresidues. 
There are two ways to see this. 

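The most natural way to see this is that the quadratic residues are
given by

$$1^2, 2^2, \ldots, \left(\tfrac{p-1}{2}\right)^2.$$

The next squares, $\left(\tfrac{p+1}{2}\right)^2, \left(\tfrac{p+3}{2}\right)^2, \ldots, (p - 1)^2$,
just repeat these same numbers all over again, since

$$\tfrac{p+1}{2} \equiv -\tfrac{p-1}{2}, \quad \tfrac{p+3}{2} \equiv -\tfrac{p-3}{2}, \quad \ldots, \quad p - 1 \equiv -1 \pmod{p};$$

so, among 1, 2, ..., p - 1 there are $\tfrac{p-1}{2}$ quadratic residues and,
therefore, $\tfrac{p-1}{2}$ quadratic nonresidues.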

For example, $14^2 = 196 \equiv 6 \pmod{19}$, which just repeats the
quadratic residue 6 already produced by $5^2$; this is because
$14 = 19 - 5 \equiv -5 \pmod{19}$.

The other way to see this same thing is to let a be a primitive root
of p. Then the quadratic residues are just the even powers of a, that
is, $a^2, a^4, \ldots, a^{p-1}$; and the quadratic nonresidues are the odd
powers, $a, a^3, \ldots, a^{p-2}$. This works because it is only the even
powers that can be squares; for example, $a^6 = (a^3)^2$.

Let's look at the quadratic residues for the prime p = 19 using both
of these ideas. First, then, the quadratic residues are

$$1, \; 4, \; 9, \; 16, \; 5^2 \equiv 6, \; 6^2 \equiv 17, \; 7^2 \equiv 11, \; 8^2 \equiv 7, \; 9^2 \equiv 5 \pmod{19};$$

that is, 1, 4, 5, 6, 7, 9, 11, 16, 17; and so, the quadratic nonresidues are
the numbers from 1 to 18 that are left: 2, 3, 8, 10, 12, 13, 14, 15, 18.

We can reach the same conclusion by using the primitive root
2 for the prime 19. On page 195 we computed all the powers
$2, 2^2, \ldots, 2^{18}$; the even powers $2^2, 2^4, \ldots, 2^{18}$ produced the
quadratic residues 4, 16, 7, 9, 17, 11, 6, 5, 1, and the odd powers
$2, 2^3, \ldots, 2^{17}$ produced the quadratic nonresidues
2, 8, 13, 14, 18, 15, 3, 12, 10.
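
Both descriptions are short enough to compute side by side. The sketch below does so for p = 19 with the primitive root 2; the variable names are ours.

```python
p, g = 19, 2   # the prime and one of its primitive roots, as in the text

# First way: square the numbers 1, 2, ..., (p - 1)/2.
by_squaring = sorted({(x * x) % p for x in range(1, (p - 1) // 2 + 1)})

# Second way: even powers of the primitive root; odd powers give nonresidues.
by_even_powers = sorted(pow(g, k, p) for k in range(2, p, 2))
nonresidues = sorted(pow(g, k, p) for k in range(1, p, 2))

print(by_squaring)                      # [1, 4, 5, 6, 7, 9, 11, 16, 17]
print(by_squaring == by_even_powers)    # True
print(nonresidues)                      # [2, 3, 8, 10, 12, 13, 14, 15, 18]
```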

By thinking of quadratic residues as those powers of a primitive root 
a having even exponents, and quadratic nonresidues as having odd 
exponents, we have two more properties of quadratic residues: 

(i) the product of two quadratic residues, or of two quadratic 
nonresidues, is a quadratic residue; 

(ii) the product of a quadratic residue and a quadratic 
nonresidue is a quadratic nonresidue. 

These properties follow immediately from basic properties of exponents.
For example, the first part of property (i) follows because if $a^r$ and
$a^s$ are two quadratic residues, then r and s are even; so r + s is even,
which means that the product $a^r a^s = a^{r+s}$ is a quadratic residue.

Or let’s look at a specific example of property (ii) modulo 19. The
product of a quadratic residue 7 and a quadratic nonresidue 13 is
$7 \cdot 13 = 91 \equiv 15 \pmod{19}$. Why does 15 turn out to be a quadratic
nonresidue? This is immediately clear if we write 7 as $2^6$, and 13 as
$2^5$, because then $7 \cdot 13$ becomes $2^6 \cdot 2^5 = 2^{11}$, which is a
quadratic nonresidue, since 11 is odd.

It is these very basic properties of quadratic residues that Euler used 
in a clever way to prove the following theorem, a theorem which later 
became the key link for Lagrange in his proof of the four squares 
theorem. 

Theorem 7.3. If p is an odd prime, then the congruence

$$x^2 + y^2 \equiv -1 \pmod{p}$$

has a solution.

Proof

Recall from Theorem 6.4 that if p is a prime of the form 4n + 1, then the
congruence

$$x^2 \equiv -1 \pmod{p}$$

has a solution. Therefore, in the case where p is of the form 4n + 1, we
can set y = 0, and the congruence $x^2 + y^2 \equiv -1 \pmod{p}$ has a
solution.

Hence we need only consider the case p = 4n + 3. We write the
congruence as

$$x^2 + 1 \equiv -y^2 \pmod{p}.$$

Now, in this case, where p = 4n + 3, the congruence
$x^2 \equiv -1 \pmod{p}$ has no solution (again this is by Theorem 6.4).
Therefore, -1 is a quadratic nonresidue of p.

Now, by property (ii) above, this means that $-y^2$ is also a quadratic
nonresidue for any y not divisible by p (since the product of a quadratic
residue, $y^2$, and a quadratic nonresidue, -1, is a quadratic nonresidue).

Therefore, we can solve this congruence by letting $-y^2$ be the first
quadratic nonresidue in the sequence 1, 2, ..., p - 1, and letting $x^2$
be the preceding integer (which is then necessarily a quadratic residue).
Note that since $-y^2$ is a quadratic nonresidue, $y^2 = -(-y^2)$ is a
quadratic residue (by property (i) above), so we will be able to solve for
y. This completes the proof. ■

We can illustrate Euler’s proof of this theorem using p = 19. The first
quadratic nonresidue in the list 1, 2, ..., p - 1 is 2, so we set
$-y^2 = 2$ and $x^2 = 1$. Thus x = 1 and
$y^2 \equiv -2 \equiv 17 \equiv 36 \pmod{19}$, and so y = 6. Hence x = 1, y = 6
is a solution to the congruence $x^2 + y^2 \equiv -1 \pmod{19}$; that is,
$1^2 + 6^2 = 37 \equiv -1 \pmod{19}$.
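
Euler’s argument is constructive, so it can be followed step by step in code. The sketch below handles only primes of the form 4n + 3 and finds square roots by brute-force search; the function names are ours, and the primality of p is assumed rather than checked.

```python
def quadratic_residues(p):
    """The set of nonzero squares modulo the prime p."""
    return {(x * x) % p for x in range(1, p)}

def euler_solution(p):
    """Return (x, y) with x*x + y*y = -1 (mod p), assuming p is a prime 4n + 3."""
    residues = quadratic_residues(p)
    # The first nonresidue m plays the role of -y^2 in the proof.
    m = next(v for v in range(2, p) if v not in residues)
    # m - 1 is a residue (it precedes the first nonresidue): solve x^2 = m - 1.
    x = next(v for v in range(1, p) if (v * v) % p == m - 1)
    # -m is a residue because -1 and m are both nonresidues: solve y^2 = -m.
    y = next(v for v in range(1, p) if (v * v) % p == (-m) % p)
    return x, y

print(euler_solution(19))    # (1, 6)
for p in (3, 7, 11, 19, 23, 31, 43):
    x, y = euler_solution(p)
    print(p, x, y, (x * x + y * y + 1) % p == 0)
```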

We conclude this very brief introduction to quadratic residues with
one final observation: the clever trick that Euler used with -1 in this
proof works in general for primes of the form 4n + 3 to show that if a is
a quadratic residue, then -a, that is, p - a, is a quadratic nonresidue,
and vice versa. For example, for p = 19, the quadratic residues are
1, 4, 5, 6, 7, 9, 11, 16, 17, and so the quadratic nonresidues are

-1, -4, -5, -6, -7, -9, -11, -16, -17,

that is, 18, 15, 14, 13, 12, 10, 8, 3, 2.

Figure 7.2 Joseph Louis Lagrange, 1736-1813.


Lagrange 

It is almost inconceivable to us now, from our perspective in the twenty- 
first century, at a time when travel is so easy, to realize that the two great 
mathematicians of the eighteenth century, Euler and Lagrange, never 
met one another. Yet their lives and their mathematics were deeply 
intertwined. We tend to think of Lagrange as a French mathematician,
and the French certainly do, but he was born in Turin as Giuseppe
Lodovico Lagrangia. And at the age of nineteen he was appointed as
professor of mathematics at the Royal Artillery School in Turin. 

In that same year, 1755, Lagrange wrote for the first time to Euler, his
elder by almost thirty years, who was by then at the Royal Academy
in Berlin. He wrote about methods he had developed for solving problems
in an area we now call the calculus of variations involving curves such
as the tautochrone, a curve with the remarkable property that if two
point masses begin falling simultaneously from two points anywhere
on the curve, and fall along the curve only under the influence of
gravity, then they will reach a given fixed point at exactly the same
time. Euler was tremendously impressed by the young Lagrange and,
in 1756, even arranged for him to be offered a position in Berlin, but
Lagrange declined. However, when Euler left Berlin in 1766 to return to 
St. Petersburg, it was Lagrange who took his place, and he was to remain 
there for twenty years as director of mathematics, before eventually 
moving to Paris. 

We have already mentioned many of Lagrange’s major contributions
in number theory: his proof of Wilson’s theorem, Theorem 6.3;
his theorem on the number of solutions of a polynomial congruence, 
Theorem 6.5; and in the next section we will see his crowning achieve- 
ment in number theory, his four squares theorem, Theorem 7.4. But 
Lagrange also made vital contributions across the entire field of math- 
ematics: he invented the method known as variation of parameters for 
solving differential equations; it perhaps goes without saying that the 
method of Lagrange multipliers is due to him; while in Berlin he wrote his 
great treatise Mécanique analytique on mechanics; and one of the most
fundamental theorems in group theory is called Lagrange’s theorem, 
which says that the order of any subgroup divides the order of the group 
(thus generalizing the important corollary to Fermat’s little theorem 
about the order of a dividing p - 1). 


Lagrange's Four Squares Theorem 

With Theorem 7.3 in hand, we can now proceed with the proof that
every number can be written as a sum of four squares, that is, with a
proof of Lagrange’s four squares theorem. By Theorem 7.3, if p is an odd
prime, the congruence $x^2 + y^2 \equiv -1 \pmod{p}$ has a solution, so we
can write

$$kp = x^2 + y^2 + 1^2 + 0^2;$$

that is, Theorem 7.3 tells us we can write a multiple of p as a sum of four 
squares. 

Now we will continue much as we did in the proof of Theorem 5.1, 
using a descent argument to produce a smaller multiple of p that is also 
a sum of four squares. Thus p itself must be a sum of four squares. 

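In the expression for the multiple kp above, we choose x and y
carefully so that their absolute values are less than $\tfrac{p}{2}$. This in
turn means that k < p (since
$kp = x^2 + y^2 + 1 < \tfrac{p^2}{4} + \tfrac{p^2}{4} + 1 = \tfrac{p^2}{2} + 1$,
so $k < \tfrac{p}{2} + \tfrac{1}{p} < p$).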

The key to the proof, then, is the descent argument. Let’s see how the 
descent works by looking at an example. We will start with a multiple, 
$18 \cdot 19$, of the prime 19 written as a sum of four squares

$$18 \cdot 19 = 342 = 16^2 + 9^2 + 2^2 + 1^2.$$

We can reduce each of these squares modulo the number 18 by finding
numbers in the interval $-8 \le x \le 9$ congruent to each of 16, 9, 2,
and 1. This gives us a new smaller sum of four squares that is also a
multiple of 18:

$$18 \cdot 5 = (-2)^2 + 9^2 + 2^2 + 1^2.$$

(Here, we first computed $(-2)^2 + 9^2 + 2^2 + 1^2 = 90$, and then factored
to get $18 \cdot 5$.)

Now we multiply these two sums together, using the fundamental
identity of Euler on page 199, and we get

$$\begin{aligned}
18^2 \cdot 5 \cdot 19 &= (16^2 + 9^2 + 2^2 + 1^2)((-2)^2 + 9^2 + 2^2 + 1^2) \\
&= (-32 + 81 + 4 + 1)^2 + (144 + 18 - 2 + 2)^2 \\
&\quad + (32 + 9 + 4 - 9)^2 + (16 - 18 + 18 + 2)^2 \\
&= 54^2 + 162^2 + 36^2 + 18^2.
\end{aligned}$$

And so, dividing through by 18 2 , we see that 

$$5 \cdot 19 = 3^2 + 9^2 + 2^2 + 1^2$$

is a smaller multiple of 19 written as a sum of four squares. 

Thus we were able to achieve exactly what we wanted. We began
with one multiple $18 \cdot 19$ of the prime 19 written as a sum
of four squares, and produced a smaller multiple $5 \cdot 19$ of 19 also
written as a sum of four squares.
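
A single descent step can be written as a short function. The sketch below is an illustration under the conventions of this example: `centered` picks the representative in the interval $-k/2 < r \le k/2$, `euler_four_square` uses the sign arrangement of the fundamental identity given earlier, and all names are ours. The internal `assert` checks the divisibility that makes the final division possible.

```python
def centered(v, k):
    """Representative r of v modulo k with -k/2 < r <= k/2."""
    r = v % k
    return r - k if r > k // 2 else r

def euler_four_square(x, y):
    """The four terms produced by the fundamental identity of Euler."""
    a, b, c, d = x
    al, be, ga, de = y
    return (a * al + b * be + c * ga + d * de,
            a * be - b * al - c * de + d * ga,
            a * ga + b * de - c * al - d * be,
            a * de - b * ga + c * be - d * al)

def descent_step(squares, k):
    """From kp = a^2 + b^2 + c^2 + d^2, return four terms and m with mp = their squares' sum."""
    reduced = tuple(centered(v, k) for v in squares)
    m = sum(v * v for v in reduced) // k          # reduced squares sum to m * k
    terms = euler_four_square(squares, reduced)   # squares sum to k^2 * m * p
    assert all(t % k == 0 for t in terms)
    return tuple(t // k for t in terms), m

terms, m = descent_step((16, 9, 2, 1), 18)
print(terms, m)                          # (3, 9, 2, 1) 5
print(sum(t * t for t in terms))         # 95, that is, 5 * 19
```

Calling `descent_step` again on the result, with k replaced by the new multiplier m, continues the descent.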

In Problem 7.27, you will be asked to continue the descent in this 
example all the way to the bottom, that is, until 19 itself has been 
written as a sum of four squares. In the general proof of the theorem, 
however, it is sufficient to show that a single descent can be made, 
because if you can descend one step, then you can descend another, and 
another, until you necessarily reach the prime p. 

Theorem 7.4 (Lagrange's four squares theorem). Every positive integer 
can be written as a sum of four squares. 

Proof 

It will suffice to show that for a multiple kp of an odd prime p where
1 < k < p that has been written as a sum of four squares

$$kp = a^2 + b^2 + c^2 + d^2,$$

there is a smaller multiple mp, that is, one with $1 \le m < k$, such that
mp can also be written as a sum of four squares.
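
We reduce the squares $a^2, b^2, c^2, d^2$ modulo k by choosing
$\alpha, \beta, \gamma, \delta$ in the interval $-\tfrac{k}{2} < x \le \tfrac{k}{2}$
such that

$$a \equiv \alpha, \quad b \equiv \beta, \quad c \equiv \gamma, \quad d \equiv \delta \pmod{k},$$

and then writing

$$mk = \alpha^2 + \beta^2 + \gamma^2 + \delta^2.$$

(The sum on the right is a multiple of k because it is congruent modulo k
to $a^2 + b^2 + c^2 + d^2 = kp$.)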


Since, after multiplying these two sums together using the fundamental 
identity of Euler, and dividing by $k^2$, we will have written $mp$ as 
a sum of four squares, we need only show that the descent has worked, 
that is, that $1 \le m < k$. 

First, we see that $m \ge 1$, since otherwise $m = 0$, and $\alpha = \beta = \gamma = \delta = 0$, 
which means that $k^2$ divides each of $a^2, b^2, c^2, d^2$, and so $k$ would 
divide $p$, but $1 < k < p$. 

Next, we show that $m < k$. Well, what is the largest that $m$ could 
possibly be? This occurs when each of $\alpha, \beta, \gamma, \delta$ is as large as it can 
be, that is, $\frac{k}{2}$. So the largest possible value for $m$ is 

$$\frac{\left(\frac{k}{2}\right)^2 + \left(\frac{k}{2}\right)^2 + \left(\frac{k}{2}\right)^2 + \left(\frac{k}{2}\right)^2}{k} = k,$$

and this occurs only for $\alpha = \beta = \gamma = \delta = \frac{k}{2}$. But this case is impossible, 
since it would imply that 

$$kp = a^2 + b^2 + c^2 + d^2 \equiv \alpha^2 + \beta^2 + \gamma^2 + \delta^2 = k^2 \equiv 0 \pmod{k^2},$$

and $k$ would divide $p$. Thus $m < k$. 

All that remains is to multiply the original expression 

$$kp = a^2 + b^2 + c^2 + d^2$$

by this new expression 

$$mk = \alpha^2 + \beta^2 + \gamma^2 + \delta^2,$$

using the fundamental identity of Euler to get an expression for $k^2 mp$ as 
a sum of four squares. The final detail of showing that we can divide 
each of these four squares by $k^2$, thus revealing $mp$ as a sum of four 
squares, is left as an exercise in Problem 7.28. This completes the 
proof. ■ 
Sums of Three Squares 

You might have noticed something a little strange. We went almost 
directly from the theorem of Fermat in Chapter 5, Theorem 5.3, which 
allows us to characterize the numbers that can be written as sums 
of two squares, to Lagrange's theorem in this chapter, Theorem 7.4, 
which tells us that all numbers can be written as sums of four squares. 
We never paused anywhere in between to determine which numbers 
can be represented as sums of three squares. There is a perfectly good 
reason for this omission: the corresponding problem for three squares 
is significantly harder than the problems for two or four squares. 

Moreover, we can see why the problem for three squares is harder. 
Recall from Problem 1.16 that no number of the form $8k + 7$ can be the 
sum of three squares simply because squares have to be congruent to 0, 
1, or 4 modulo 8. Thus, for example, $143 = 8 \cdot 17 + 7$ can’t be the sum 
of three squares. Yet each of its prime factors, $11 = 3^2 + 1^2 + 1^2$ and 
$13 = 3^2 + 2^2 + 0^2$, is a sum of three squares. This means that there can 
be no hope of ever finding an identity analogous to the fundamental 
identities of Fibonacci and Euler that would allow us to represent a 
product $(a^2 + b^2 + c^2)(d^2 + e^2 + f^2)$ as a sum of three squares. 

Legendre attempted a proof of which numbers can be represented by 
three squares, but it was Gauss who gave the first proof in Disquisitiones 
Arithmeticae. In addition to numbers of the form $8k + 7$, it can also be 
seen that any number of the form $4^n(8k + 7)$ cannot be written as a sum 
of three squares (see Problem 7.32). In fact, and this is what Gauss was 
able to prove, any number not of this form can be written as a sum 
of three squares. However, as we have indicated, the proof is not at all 
straightforward, and we shall not include it. 
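
Although the proof is out of reach here, the statement itself is easy to test by brute force. The following sketch (the helper names and the search limit of 2000 are our own choices, not anything from Gauss or Legendre) compares the numbers that fail to be sums of three squares with the numbers of the form $4^n(8k + 7)$, and also rechecks the example of 11, 13, and 143.

```python
from math import isqrt


def is_sum_of_three_squares(n):
    """Brute-force test: is n = x^2 + y^2 + z^2 with integers x, y, z >= 0?"""
    for x in range(isqrt(n) + 1):
        for y in range(x + 1):
            rest = n - x * x - y * y
            if rest < 0:
                break  # larger y only makes rest smaller
            z = isqrt(rest)
            if z * z == rest:
                return True
    return False


def has_excluded_form(n):
    """True when the positive integer n equals 4^a * (8k + 7) for some a, k >= 0."""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7


limit = 2000  # arbitrary bound for the experiment
exceptions = [n for n in range(1, limit + 1) if not is_sum_of_three_squares(n)]
predicted = [n for n in range(1, limit + 1) if has_excluded_form(n)]
print(exceptions == predicted)
print(is_sum_of_three_squares(11), is_sum_of_three_squares(13), is_sum_of_three_squares(143))
```

Agreement up to a bound is of course only evidence, not a proof.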


Waring's Problem 

We have devoted a considerable amount of time in this book to prob- 
lems involving sums of squares, in part because of the origins of these 
problems in the ancient mathematics of the Greeks, but also because 
of the intense interest in these problems shown by mathematicians 
such as Fermat, Euler, and Lagrange. Mostly we have done so, however, 
because the two main results concerning sums of squares — Fermat’s 
characterization of the numbers that can be written as a sum of two 
squares (Theorem 5.3 and its corollary) and Lagrange’s four squares 
theorem (Theorem 7.4) — are among the most deeply satisfying achieve- 
ments in all of number theory. 

A natural question to ask is whether there are any similar results for 
sums of other powers. For example, what numbers can be written as 
a sum of cubes? Fermat’s last theorem tells us that no cube is a sum 
of two cubes, but we saw in Problem 1.18 that it is possible for a cube 
to be a sum of three cubes. Might it also be possible that there is 
a theorem analogous to Lagrange’s four squares theorem saying that 
for some small fixed number c, every number can be written as a sum 
of c cubes? 

In fact, it is fairly easy to guess what the number $c$ should be. To do so 
for sums of cubes, let’s first try to see 
where the “four” comes from in Lagrange’s four squares theorem. Since 
$8 = 2^2 + 2^2$ is less than $3^2$, it is clear that 7, being 1 less than 8, is going to 
require four squares: $7 = 2^2 + 1^2 + 1^2 + 1^2$. Then, roughly speaking, any 
number greater than 7 won’t ever need more than four squares because 
there will be more squares to work with. Now, for cubes, we can see that 
the number 23 plays a similar role. Since 23 is 1 less than $24 = 2^3 + 2^3 + 2^3$ 
and 24 is less than $3^3$, we can see that 23 is going to require nine cubes: 
$23 = 2^3 + 2^3 + 1^3 + 1^3 + 1^3 + 1^3 + 1^3 + 1^3 + 1^3$. 

So, it might just be possible (and this is a pretty wild guess at this 
point) that every number $n$ can be written as a sum of nine cubes. Your 
confidence in this conjecture would increase substantially if you were to 
check that, except for 23, every positive integer up to 238 can be written 
as a sum of fewer than nine cubes. Then, once again, 239 requires nine 
cubes: $239 = 4^3 + 4^3 + 3^3 + 3^3 + 3^3 + 3^3 + 1^3 + 1^3 + 1^3$. 
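
The check suggested here is tedious by hand but quick by machine. As a rough sketch (the function name `min_cubes_table` is ours), a simple table built from the bottom up records the least number of positive cubes needed for each number up to 239.

```python
def min_cubes_table(limit):
    """Return a list need with need[n] = least number of positive cubes summing to n."""
    cubes = []
    c = 1
    while c ** 3 <= limit:
        cubes.append(c ** 3)
        c += 1
    need = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # Remove one cube q and reuse the answer already found for n - q.
        need[n] = 1 + min(need[n - q] for q in cubes if q <= n)
    return need


need = min_cubes_table(239)
print([n for n in range(1, 240) if need[n] == 9])  # numbers up to 239 that need nine cubes
print(max(need))                                   # largest count that occurs up to 239
```

The same table can be extended past 239 to experiment with how many cubes larger numbers need.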

In 1770, Edward Waring, who was the sixth Lucasian Professor 
of Mathematics at Cambridge University from 1760 to 1798 (see 
Problem 3.37 for other Lucasian Professors), made the extraordinary 
claim that not only could every number be written as a sum of 9 cubes, 
but every number could also be written as a sum of 19 fourth powers, or 
as a sum of 37 fifth powers, or more generally as a sum of g(k) kth powers 
where 


$$g(k) = \left\lfloor \left(\tfrac{3}{2}\right)^k \right\rfloor + 2^k - 2.$$

Recall that the function $\lfloor x \rfloor$ being used in this formula is the floor 
function defined in Chapter 3, and it represents the greatest integer less 
than or equal to $x$. 

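Thus Waring's claim is that for squares 

$$g(2) = \left\lfloor \left(\tfrac{3}{2}\right)^2 \right\rfloor + 2^2 - 2 = \left\lfloor \tfrac{9}{4} \right\rfloor + 2 = 2 + 2 = 4,$$

just as it should be by Lagrange’s four squares theorem. 
For cubes, Waring’s claim is that 

$$g(3) = \left\lfloor \left(\tfrac{3}{2}\right)^3 \right\rfloor + 2^3 - 2 = \left\lfloor \tfrac{27}{8} \right\rfloor + 6 = 3 + 6 = 9;$$

and for fourth powers, 

$$g(4) = \left\lfloor \left(\tfrac{3}{2}\right)^4 \right\rfloor + 2^4 - 2 = \left\lfloor \tfrac{81}{16} \right\rfloor + 14 = 5 + 14 = 19;$$

and for fifth powers, 

$$g(5) = \left\lfloor \left(\tfrac{3}{2}\right)^5 \right\rfloor + 2^5 - 2 = \left\lfloor \tfrac{243}{32} \right\rfloor + 30 = 7 + 30 = 37;$$

and so on. 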
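
Waring's formula is easy to evaluate exactly with integer arithmetic, since the floor of $(3/2)^k$ is just the integer quotient of $3^k$ by $2^k$. A minimal sketch (the name `waring_g` is ours):

```python
def waring_g(k):
    """Value of Waring's formula floor((3/2)^k) + 2^k - 2, computed with integers only."""
    return 3 ** k // 2 ** k + 2 ** k - 2


for k in (2, 3, 4, 5, 10, 20):
    print(k, waring_g(k))
```

Using integer division avoids any rounding trouble that floating-point powers of 1.5 might introduce for large $k$.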

Waring certainly had no proof for his claim, and it wasn’t until 1909 
that the great German mathematician David Hilbert would be able to 
prove that for each k there is in fact some number g(k) — not necessarily 
that given by Waring’s formula, however — such that every integer n can 
be written as a sum of g(k) kth powers. 

Waring’s problem— the name by which this general topic has become 
known — has generated an enormous amount of interest. Perhaps the 
most interesting question is not the obvious one first raised by Waring 
about the value of g(k), because that value is unduly affected by small 
values of n. For example, 23 and 239 each require nine cubes, which is 
why Waring conjectured that $g(3) = 9$ in the first place, but once you 
get past these two small values of n, any n > 239 can be written as a sum 
of eight cubes! 

This suggests that a much more interesting function to investigate is 
$G(k)$, where for each $k$ the number $G(k)$ is the smallest number such that 
past some point all positive integers can be written as a sum of $G(k)$ $k$th 
powers. 

It is now known, for example, that not only are 23 and 239 the 
only numbers that require nine cubes, but that there are only finitely 
many numbers that actually require eight cubes. The rest only need 
seven, or fewer. In other words, “past some point” only seven cubes are 
required! Therefore, $G(3) \le 7$. See Problem 7.33 for a discussion of what 
is currently suspected about the actual value of G(3). 

The number $G(k)$ is known for only two values of $k > 1$. Since any 
number of the form $8n + 7$ cannot be written as a sum of three squares, 
G(2) > 3. Therefore, by Lagrange’s four squares theorem, G(2) = 4. The 
only other value that is known is G(4) = 16. 

However, what is important to note about $G(k)$ is that, while we 
may only have estimates for the value of $G(k)$, these estimates give a 
far better indication of what is going on than does $g(k)$. For example, 
when $k = 10$, the Waring formula for $g(10)$ says that any number can be 
written as a sum of 1079 tenth powers, but $G(10) \le 59$, so in fact, past a 
certain point, only 59 or fewer tenth powers are needed. When $k = 20$, 
the formula for $g(20)$ gives more than one million, but $G(20) \le 142$. 
Fermat's Last Theorem 

We began this chapter on Euler and Lagrange with the letter of 1729 
that Christian Goldbach wrote from Moscow to Euler that first opened 
for Euler the world of number theory. We conclude this chapter on the 
two great mathematicians of the eighteenth century with a letter that 
Euler wrote to Goldbach, almost twenty-five years later, in 1753, about 
the famous conjecture that we now call Fermat's last theorem. 

There’s another very lovely theorem in Fermat whose proof he 
says he has found. Namely, on being prompted by the problem in 
Diophantus, find two squares whose sum is a square, he says that 
it is impossible to find two cubes whose sum is a cube, and two 
fourth powers whose sum is a fourth power, and more generally 
that this formula $a^n + b^n = c^n$ is impossible when $n > 2$. Now 
I have found valid proofs that $a^3 + b^3 \ne c^3$ and $a^4 + b^4 \ne c^4$, 
where $\ne$ denotes cannot equal. But the proofs in the two cases are 
so different from one another that I do not see any possibility of 
deriving a general proof from them that $a^n + b^n \ne c^n$ if $n > 2$. Yet 
one sees quite clearly as if through a trellis that the larger n is, the 
more impossible the formula must be. Meanwhile I still haven’t 
been able to prove that the sum of two fifth powers cannot be a 
fifth power. To all appearances the proof just depends on a 
brainwave, and until one has it all one’s thinking might as well 
be in vain. 

Euler worked on the first case of Fermat’s last theorem — that is, the 
case n = 3 — in the years between 1753 and 1770. His proof is not at all 
simple, and in fact at one particular point his proof was not even quite 
complete, and contained a flaw that was later corrected by Legendre. We 
will save our discussion of Euler’s proof of the case n = 3 for later in the 
book, and for now mention only that Euler based his proof on the use 
of a complex number called a cube root of unity: 

$$\rho = \frac{-1 + i\sqrt{3}}{2},$$

where the number $\rho$ has the remarkable property that $\rho^3 = 1$ (that’s 
why $\rho$ is called a cube root of unity: because when you cube it, you 
get 1, that is, unity). 

Euler then worked with the number system consisting of complex 
numbers of the form $a + b\rho$ where $a$ and $b$ are integers. In particular, 
then, Fermat’s equation 

$$x^3 + y^3 = z^3$$

can be rewritten as 

$$(x + y)(x + \rho y)(x + \rho^2 y) = z^3$$

(see Problem 7.41). 
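
Arithmetic with these numbers can be carried out exactly on a computer, without any square roots, by storing a number $a + b\rho$ as the pair of integers $(a, b)$. The sketch below is an illustration only; the pair representation and the names `mul` and `fermat_cubic_factors` are our assumptions. It uses the relation $\rho^2 = -1 - \rho$, which follows from $\rho^3 - 1 = (\rho - 1)(\rho^2 + \rho + 1) = 0$ together with $\rho \ne 1$, and it spot-checks the factorization numerically; it is not a substitute for the proof requested in Problem 7.41.

```python
def mul(u, v):
    """Multiply a + b*rho by c + d*rho, using rho^2 = -1 - rho.

    Numbers are stored as pairs (integer part, coefficient of rho).
    """
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c - b * d)


RHO = (0, 1)
RHO_SQUARED = mul(RHO, RHO)
print(mul(RHO_SQUARED, RHO))  # rho cubed, as a pair


def fermat_cubic_factors(x, y):
    """Expand (x + y)(x + rho*y)(x + rho^2*y) for integers x and y."""
    first = (x + y, 0)
    second = (x, y)        # x + rho*y
    third = (x - y, -y)    # x + rho^2*y, since rho^2 = -1 - rho
    return mul(mul(first, second), third)


print(fermat_cubic_factors(2, 5), 2 ** 3 + 5 ** 3)
```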

We now prove Fermat’s last theorem for the case n = 4, one of the 
two cases that Euler proved, and the only case that Fermat himself 
proved. Actually, we will prove that there is no nontrivial solution to 
the equation 

$$x^4 + y^4 = z^2,$$

from which it follows immediately that there can be no nontrivial 
solution for 

$$x^4 + y^4 = z^4.$$

(This is because if $x = a$, $y = b$, $z = c$ is a solution for $x^4 + y^4 = z^4$, then 
$x = a$, $y = b$, $z = c^2$ is a solution for $x^4 + y^4 = z^2$.) 

This proof uses Fermat’s method of infinite descent, although we will 
cast it in a slightly new form in order to show you a variation of what by 
now should feel like a very familiar method. 

Recall that we insert the word “nontrivial” into these statements 
about solutions to equations to avoid uninteresting solutions such as 
$x = 43$, $y = 0$, $z = 43$ for equations like $x^4 + y^4 = z^4$. So, a nontrivial 
solution is a solution consisting of positive integers. 


Theorem 7.5. There is no nontrivial solution to the equation 

$$x^4 + y^4 = z^4.$$

Proof 

We will prove the stronger statement that there is no nontrivial solution 
to the equation 

$$x^4 + y^4 = z^2.$$

By way of contradiction, assume that $z$ is the smallest positive integer 
such that $x^4 + y^4 = z^2$ has a nontrivial solution in the integers, and let 
$x^4 + y^4 = z^2$ be a specific solution in the positive integers. Then $x$ and $y$ 
are necessarily relatively prime (a common prime factor $q$ of $x$ and $y$ would 
give $q^4 \mid z^2$, hence $q^2 \mid z$, and dividing through by $q^4$ would produce a 
solution with a smaller $z$). 
Now we rewrite $x^4 + y^4 = z^2$ as $(x^2)^2 + (y^2)^2 = z^2$. Therefore, by 
Theorem 1.1, we can assume that $x^2$ is even, and that 

$$x^2 = 2st, \quad y^2 = s^2 - t^2, \quad z = s^2 + t^2,$$

with $s > t$, and $s$ and $t$ relatively prime, where one of $s$ and $t$ is odd, and 
one is even. Since squares are congruent to either 0 or 1 modulo 4, then 
it must be $s$ that is odd; otherwise, $y^2$ is congruent to 3 modulo 4. So we 
write $t = 2r$, and the first two equations become 

$$\left(\frac{x}{2}\right)^2 = sr, \quad y^2 = s^2 - (2r)^2.$$

But $s$ and $r$ are relatively prime, so $s$ and $r$ are each squares (note that 
$x$ is even, so $\frac{x}{2}$ is an integer). Thus we can write $s = u^2$, $r = v^2$, and the 
second equation becomes 

$$(2v^2)^2 + y^2 = (u^2)^2.$$

This means we have another Pythagorean triple, so we can use 
Theorem 1.1 once again to write 

$$2v^2 = 2\sigma\tau, \quad y = \sigma^2 - \tau^2, \quad u^2 = \sigma^2 + \tau^2,$$

with $\sigma$ and $\tau$ relatively prime (we are using Greek letters sigma and tau 
to play the roles of $s$ and $t$ in Theorem 1.1). But, then, $\sigma\tau = v^2$, so $\sigma$ and 
$\tau$ are both squares, and we can write $\sigma = \alpha^2$, $\tau = \beta^2$. 

Therefore, $u^2 = \sigma^2 + \tau^2 = (\alpha^2)^2 + (\beta^2)^2$, and we have 

$$\alpha^4 + \beta^4 = u^2,$$

and this is a smaller solution to the original equation than 

$$x^4 + y^4 = z^2,$$

since $u^2 = s < s^2 < z < z^2$. 

But, by assumption, $z$ is the smallest positive integer among all 
nontrivial solutions. This contradiction completes the proof. ■ 
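
A computer search cannot replace the descent argument, but it is reassuring to see the theorem confirmed on a finite range. The following sketch (the function name and the bound of 300 are arbitrary choices of ours) looks for positive integers $x \le y$ up to the bound for which $x^4 + y^4$ is a perfect square.

```python
from math import isqrt


def fourth_power_solutions(bound):
    """List all (x, y, z) with 1 <= x <= y <= bound and x^4 + y^4 = z^2."""
    found = []
    for x in range(1, bound + 1):
        for y in range(x, bound + 1):
            total = x ** 4 + y ** 4
            z = isqrt(total)  # exact integer square root, no floating point
            if z * z == total:
                found.append((x, y, z))
    return found


print(fourth_power_solutions(300))
```

By Theorem 7.5 the list must be empty for every bound; a nonempty result would signal a bug in the search rather than a counterexample.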


Problems 

7.1 (S) The paragraph leading up to Condorcet’s famous quote that, at the 
moment of his death, Euler “ceased to calculate and to breathe” 
contains a potential tongue twister: “Euler's eulogy.” Check your 
pronunciation of the following names (either with someone who 
knows the correct pronunciations or in the Pronunciation Guide 
provided at the back of the book): Euler, Aryabhata, Bachet, Erdos, 
Fermat, Gauss, Lagrange, Thabit ibn Qurra, Leibniz, Mersenne, and Zu. 

7.2 ★ (S) (a) Illustrate the proof of Euler’s theorem, Theorem 7.1, for $n = 14$ 
and $a = 5$, by first finding a reduced residue system 
$a_1, a_2, \ldots, a_{\phi(14)}$, and then showing that $aa_1, aa_2, \ldots, aa_{\phi(14)}$ 
is also a reduced residue system. 

(b) Verify directly that $5^{\phi(14)} \equiv 1 \pmod{14}$ as predicted by Euler’s 
theorem. 

(c) Find the order of 5 modulo 14. 

7.3 ★ (S) Mathematicians often use $\mathbb{N}$ to denote the set of natural 
numbers, that is, the positive integers. Prove that if $f : \mathbb{N} \to \mathbb{N}$ 
is a multiplicative function, then $f(1) = 1$. 

7.4 (S) Illustrate Gauss’s proof that $\phi$ is multiplicative by letting $a = 6$ and 
$b = 5$, and verifying that $\phi(30) = \phi(6)\phi(5)$ by explicitly showing how 
the eight numbers less than and relatively prime to 30 are solutions to 
the eight systems of congruences that result from $\phi(6) = 2$ and 
$\phi(5) = 4$. 

7.5 (S) Complete Gauss’s proof, presented on page 194, that $\phi$ is 
multiplicative by showing that the set of $\phi(a)\phi(b)$ solutions to the 
system of congruences 

$$x \equiv r \pmod{a} \quad \text{and} \quad x \equiv s \pmod{b}$$

is, in fact, a reduced residue system modulo ab; that is, verify the 
following: 

(a) Each solution is distinct modulo ab. 

(b) Each solution is relatively prime to ab. 

(c) Any positive number less than and relatively prime to ab is a 
solution to one of these systems of congruences. 

7.6 (H,S) In his Mathematical Games column in Scientific American in January 
1964, Martin Gardner asked his readers to express the number 64 with 
four 4s, then with three 4s, and finally with two 4s. For example, here is 
one way to write 64 using four 4s: 


64 = (4 + 4) • (4 + 4). 
(a) Solve Gardner’s problem as he originally intended using 
ordinary mathematical symbols such as parentheses, arithmetic 
operations such as addition and multiplication, as well as 
familiar number-theoretic functions such as the square root and 
factorial functions by finding another way to write 64 with four 
4s, a way to do it with three 4s, and also with just two 4s. 

(b) Then, also express the number 64 with two 4s using Euler’s phi 
function $\phi(n)$. 

7.7 For $n > 2$, $\phi(n)$ is even, because if $a$ is a positive integer less than $n$ 
relatively prime to $n$, then so is $n - a$ (note that $\frac{n}{2}$ is not relatively prime 
to $n$, so $a \ne \frac{n}{2}$). Use the observation that the positive numbers less than 
$n$ relatively prime to $n$ come in pairs to prove that for $n > 1$ the sum of 
the positive integers less than and relatively prime to $n$ is equal to 
$\frac{1}{2} n \phi(n)$. 

For example, for $n = 12$, we get $1 + 5 + 7 + 11 = 24 = \frac{1}{2} \cdot 12 \cdot 4$. 

Note, by the way, that we could also tell that $\phi(n)$ is always even, for 
$n > 2$, simply by looking at the first form of the formula for $\phi(n)$ on 
page 193. 

7.8 ★ (S) In Problem 7.7, we presented a property of Euler’s phi function 
that involves adding the positive numbers less than and relatively 
prime to $n$. 

In Disquisitiones Arithmeticae, Gauss presented a remarkably useful 
formula that involves adding all the numbers $\phi(d)$ where $d$ ranges over 
the divisors of $n$. What Gauss discovered was that this sum is the 
number $n$ itself. For example, for $n = 12$, the divisors $d$ are 1, 2, 3, 4, 6, 
and 12, and so the sum is 

$$\phi(1) + \phi(2) + \phi(3) + \phi(4) + \phi(6) + \phi(12) = 1 + 1 + 2 + 2 + 2 + 4 = 12.$$

Verify Gauss’s property for $n = 36$. Then note that given a divisor of 
36 such as $d = 4$, we have $\phi(\frac{36}{4}) = \phi(9) = 6$, and there are exactly six 
numbers 4, 8, 16, 20, 28, and 32 from 1 to 36 whose gcd with 
36 equals 4; or, as another example, if $d = 12$, then $\phi(\frac{36}{12}) = \phi(3) = 2$, 
and there are exactly two numbers 12 and 24 from 1 to 36 whose gcd 
with 36 equals 12. 

Then, use this idea (don’t forget to justify it) to prove Gauss’s 
property; that is, prove in general that 

$$\sum_{d \mid n} \phi(d) = n.$$
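
The grouping idea in Problem 7.8 can be explored numerically before attempting the proof. The sketch below, for the text's example $n = 12$, sorts the numbers from 1 to $n$ by their gcd with $n$ and compares the size of each group with $\phi(n/d)$. The helper names are ours, and `phi` is computed naively by counting rather than by the formula on page 193.

```python
from collections import Counter
from math import gcd


def phi(n):
    """Euler's phi function, computed by direct counting."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def gcd_classes(n):
    """Count how many of 1, 2, ..., n have each possible gcd with n."""
    return Counter(gcd(a, n) for a in range(1, n + 1))


n = 12
classes = gcd_classes(n)
for d in sorted(classes):
    print(d, classes[d], phi(n // d))  # divisor, size of its group, phi(n/d)
print(sum(phi(d) for d in classes) == n)
```

Changing `n` lets you repeat the experiment for other values, including the case $n = 36$ discussed in the problem.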
7.9 ★ (S) We have seen that 2 is a primitive root for the prime 19. However, 
2 is not a primitive root for the prime 7, since $2^3 \equiv 1 \pmod{7}$. But we 
can see that the prime 7 has 3 as a primitive root, since we can compute 
the powers of 3 to get 

$$3, \quad 3^2 \equiv 2, \quad 3^3 \equiv 6, \quad 3^4 \equiv 18 \equiv 4, \quad 3^5 \equiv 12 \equiv 5, \quad 3^6 \equiv 15 \equiv 1 \pmod{7}.$$

Find the least prime p > 10 for which 2 is a primitive root. Next find 
the least prime q > 10 for which 2 is not a primitive root, and then find 
a primitive root for the prime q. 
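
Computations like the powers of 3 modulo 7 in Problem 7.9 are easy to automate. Here is a minimal sketch (the function names `order` and `is_primitive_root` are ours) that finds the order of a number by repeated multiplication and rechecks the three facts quoted in the problem; it deliberately stops short of the primes the problem asks you to find.

```python
from math import gcd


def order(a, n):
    """Multiplicative order of a modulo n; a must be relatively prime to n."""
    if gcd(a, n) != 1:
        raise ValueError("a must be relatively prime to n")
    power, e = a % n, 1
    while power != 1:
        power = power * a % n
        e += 1
    return e


def is_primitive_root(a, p):
    """For a prime p, a is a primitive root exactly when its order is p - 1."""
    return order(a, p) == p - 1


print([pow(3, e, 7) for e in range(1, 7)])  # successive powers of 3 modulo 7
print(order(2, 7), is_primitive_root(3, 7), is_primitive_root(2, 19))
```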

7.10 (S) Find a primitive root for $n = 10$; for $n = 18$; and for $n = 25$. 

7.11 ★ (S) An idea from the proof of Theorem 7.2 is that once you know one 
primitive root $a$ for a prime $p$, then you can find all the primitive roots, 
because a number $a^r$ will be a primitive root of $p$ if and only if $r$ is 
relatively prime to $p - 1$. 

Use this idea to find all the primitive roots for the prime p = 13. 

7.12 (H,S) Find the least primitive root a for the prime 41. Then find the order 

of r for each positive integer r less than a. 

7.13 (H,S) Let $p$ be a prime. Prove that if two numbers $b$ and $c$ have orders $r$ and 
$s$ modulo $p$, respectively, and if $r$ and $s$ are relatively prime, then the 
product $a = bc$ has order $rs$ modulo $p$. 

7.14 (H,S) Use Lagrange’s theorem, Theorem 6.5, to prove that if $p$ is prime 
and $d \mid (p - 1)$, then the congruence 

$$x^d \equiv 1 \pmod{p}$$

has exactly $d$ solutions. 

7.15 ★ (S) Use Legendre’s method of writing $p - 1$ in terms of its prime 
decomposition to find the four primitive roots of $p = 13$. Compare your 
answer with that of Problem 7.11. 

7.16 (S) In Theorem 7.2, we showed that if a prime $p$ has a primitive root, then 
it has exactly $\phi(p - 1)$ primitive roots. 

Prove that if an integer $n$ has a primitive root, then it has exactly 
$\phi(\phi(n))$ primitive roots. Then verify this statement for $n = 9$. 

Problems 7.17-7.19 investigate a single important question: which 
composite integers have primitive roots? 
7.17 (H,S) We can see that the integer 4 has a primitive root, because 1 and 3 
are the only positive integers less than and relatively prime to 4; so 3 is 
a primitive root of 4 since $3^2 \equiv 1 \pmod{4}$. On the other hand, we can 
also see that the integer 8 does not have a primitive root, since $\phi(8) = 4$ 
and each odd integer 1, 3, 5, and 7 has order 1 or 2 modulo 8, and 
therefore cannot be a primitive root for $n = 8$. 

Prove in general that the integer $2^k$, for $k > 2$, does not have a 
primitive root. 
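
The two observations that open Problem 7.17 can be tabulated directly. The sketch below (with a helper name of our own choosing) lists the order of every number relatively prime to 4 and to 8; a primitive root would have to appear with order $\phi(n)$.

```python
from math import gcd


def orders_of_units(n):
    """Map each a in 1..n-1 that is relatively prime to n to its order modulo n."""
    result = {}
    for a in range(1, n):
        if gcd(a, n) == 1:
            power, e = a, 1
            while power != 1:
                power = power * a % n
                e += 1
            result[a] = e
    return result


print(orders_of_units(4))  # 3 has order 2 = phi(4), so 3 is a primitive root of 4
print(orders_of_units(8))  # no entry has order 4 = phi(8)
```

Running the same function on 16 or 32 gives experimental support for the general claim, though the proof still has to be supplied.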

7.18 (H,S) (a) Verify that the integer 15 does not have a primitive root by 
noting that $\phi(15) = \phi(3)\phi(5) = 2 \cdot 4 = 8$, and showing that for 
each of the 8 numbers $a$ less than and relatively prime to 15, 
$a^4 \equiv 1 \pmod{15}$. 

(b) Prove that if m and n are two relatively prime integers, both 
greater than 2, then their product mn does not have a primitive 
root. 

7.19 (H,S) Prove that the only integers greater than 1 that have primitive roots 

are 2, 4, and integers of the form $p^a$ or $2p^a$ with $p$ an odd prime. 

7.20 In the text, we wrote 133 as a sum of four squares by using the prime 
decomposition of 133, and applying the fundamental identity of Euler 
on page 199 where $7 = 2^2 + 1^2 + 1^2 + 1^2$ and $19 = 4^2 + 1^2 + 1^2 + 1^2$. 

Write 133 as a sum of four squares, again by using Euler's identity, 
but instead use $7 = 2^2 + 1^2 + 1^2 + 1^2$ and $19 = 3^2 + 3^2 + 1^2 + 0^2$. 

7.21 * Verify the fundamental identity of Euler for four squares on page 199 

by expanding both sides and comparing terms. 

7.22 ★ (S) Find the quadratic residues, and the quadratic nonresidues, for the 
prime $p = 23$ in two ways: 

(a) by computing $1^2, 2^2, \ldots, 11^2$ modulo 23; 

(b) by using the primitive root 5 for the prime 23, and getting the 
quadratic residues as even powers of 5 and the quadratic 
nonresidues as the odd powers, in the list $5, 5^2, 5^3, \ldots, 5^{22}$. 

Also, note that, since 23 is of the form $4n + 3$, each quadratic residue 
$a$ you find corresponds to a quadratic nonresidue $p - a$, and vice versa. 

7.23 (S) Use the list of quadratic residues and quadratic nonresidues for the 
prime 23 from Problem 7.22, and the method used by Euler in the 
proof of Theorem 7.3, to find a solution for the congruence 
$x^2 + y^2 \equiv -1 \pmod{23}$. 

7.24 (S) (a) Find the quadratic residues and the quadratic nonresidues for 

the prime p = 13. 

(b) Verify that 13 does not have the property that each quadratic 
residue a corresponds to a quadratic nonresidue p - a. 

(c) What property holds for the prime p = 13 instead? 

7.25 (H,S) We know by Theorem 7.3 that there is a solution to the congruence 

$$x^2 + y^2 \equiv -1 \pmod{13},$$

yet any solution differs in a very fundamental way from solutions to 
this same congruence for primes of the form 4n + 3. Explain this 
difference. 

7.26 (H,S) On page 195 we illustrated the descent argument to be used in the 
proof of Lagrange’s theorem, Theorem 7.4, by starting with the 
multiple $18 \cdot 19$ of the prime 19 written as a sum of four squares as 
$16^2 + 9^2 + 2^2 + 1^2$, and then producing the smaller multiple $5 \cdot 19$ also 
written as a sum of four squares. 

Illustrate this descent argument once again by writing the multiple 
$18 \cdot 19$ of the prime 19 as a sum of four squares as $12^2 + 10^2 + 7^2 + 7^2$, 
and then using the descent argument to find a smaller multiple of 
19 also written as a sum of four squares. 

7.27 (S) The whole point of a descent argument such as that used in proving 

Lagrange’s theorem, Theorem 7.4, is that it is sufficient to show that if a 
multiple kp of a prime p is a sum of four squares, then a smaller 
multiple of p also has this same property. Illustrate this by continuing 
the example used in the text on pages 204-5, where we began with the 
multiple $18 \cdot 19$ of the prime 19 as a sum of four squares, and then 
found a smaller multiple, $5 \cdot 19$, that was also a sum of four squares. In 
other words, continue with the descent in this example until you have 
the prime 19 written as a sum of four squares. 

7.28 (H,S) Show that in the final step of the proof of Lagrange’s four squares 
theorem, Theorem 7.4, each of the four squares that result from using 
the fundamental identity of Euler to multiply 

$$kp = a^2 + b^2 + c^2 + d^2$$

by 

$$mk = \alpha^2 + \beta^2 + \gamma^2 + \delta^2$$

is divisible by $k^2$. 

7.29 (S) There is a remarkable formula, stated both by Legendre and Gauss, 
and proved by Carl Jacobi, that gives the total number of ways of 
representing a number $n$ as a sum of two squares to be 

$$4(d_1 - d_3),$$

where $d_1$ is the number of divisors of $n$ of the form $4k + 1$, and $d_3$ is the 
number of divisors of the form $4k + 3$. 

For example, the number $n = 45 = 3^2 \cdot 5$ has six divisors, and four of 
them (1, 5, 9, and 45) are of the form $4k + 1$, and two of them (3 and 
15) are of the form $4k + 3$; so the total number of representations for 
45 as a sum of two squares is $4(4 - 2) = 8$. This corresponds to there 
being essentially only one way to write 45 as a sum of two squares, 
$45 = 3^2 + 6^2$, together with the eight variations of this sum 
that you can get by changing signs and order: $45 = (\pm 3)^2 + (\pm 6)^2 = 
(\pm 6)^2 + (\pm 3)^2$. 

Use this formula to find the total number of representations for 85 as 
a sum of two squares, and also the number of essentially different ways 
of writing 85 as a sum of two squares. 

Then use this formula to find the smallest positive integer n that can 
be written as a sum of two squares in three essentially different ways. 
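
The count in the example for $n = 45$ can be confirmed by listing representations directly. The following sketch (function names are ours) counts ordered pairs of integers, with signs, whose squares sum to $n$ and compares the result with $4(d_1 - d_3)$.

```python
from math import isqrt


def count_two_square_representations(n):
    """Count ordered pairs (x, y) of integers, signs included, with x^2 + y^2 = n."""
    count = 0
    r = isqrt(n)
    for x in range(-r, r + 1):
        rest = n - x * x
        y = isqrt(rest)
        if y * y == rest:
            count += 1 if y == 0 else 2  # y and -y are distinct unless y = 0
    return count


def jacobi_two_square_formula(n):
    """Evaluate 4 * (d1 - d3) from the divisors of n congruent to 1 and 3 mod 4."""
    d1 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 1)
    d3 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 3)
    return 4 * (d1 - d3)


print(count_two_square_representations(45), jacobi_two_square_formula(45))
```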

7.30 (S) Jacobi also found a formula for the total number of representations for 
a number $n$ as a sum of four squares. This formula is given in terms of 
$\sigma(n)$, where $\sigma(n)$ is the sum of the positive divisors of $n$. 

For $n$ odd, the number of representations is $8\sigma(n)$. For $n$ even, we 
write $n = 2^r m$ where $m$ is odd, and the number of representations for $n$ 
is $24\sigma(m)$. 

For example, if $n = 21$, then $21 = 3 \cdot 7$, so 
$\sigma(21) = 1 + 3 + 7 + 21 = 32$, and there are $8 \cdot 32 = 256$ representations. 
One of these representations, $21 = 3^2 + 2^2 + 2^2 + 2^2$, has 64 variations 
because there are four possible positions for the 3, and $2^4 = 16$ possible 
changes of signs. The only other essentially different representation is 
$21 = 4^2 + 2^2 + 1^2 + 0^2$, and this has 192 variations since there are 
$4! = 24$ orders for the four numbers 4, 2, 1, and 0, and $2^3 = 8$ possible 
changes of signs. 

Use Jacobi's formula to find the total number of ways to represent 28 
as a sum of four squares, and hence the number of essentially different 
ways of writing 28 as a sum of four squares. 
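
As with two squares, the example count for $n = 21$ can be checked by exhaustive search. This sketch (names are ours; the search is deliberately naive) counts ordered 4-tuples of integers, signs included, and compares the total with the formula as stated in the problem.

```python
from math import isqrt


def count_four_square_representations(n):
    """Count ordered 4-tuples of integers (signs included) whose squares sum to n."""
    r = isqrt(n)
    count = 0
    for a in range(-r, r + 1):
        for b in range(-r, r + 1):
            for c in range(-r, r + 1):
                rest = n - a * a - b * b - c * c
                if rest < 0:
                    continue
                d = isqrt(rest)
                if d * d == rest:
                    count += 1 if d == 0 else 2  # d and -d, unless d = 0
    return count


def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def jacobi_four_square_formula(n):
    """8*sigma(n) for odd n; 24*sigma(m) for even n = 2^r * m with m odd."""
    m = n
    while m % 2 == 0:
        m //= 2
    return 8 * sigma(m) if n % 2 == 1 else 24 * sigma(m)


print(count_four_square_representations(21), jacobi_four_square_formula(21))
```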

7.31 ★ (H,S) Perfect numbers can be defined in terms of the function $\sigma(n)$, 
where $\sigma(n)$ is the sum of the positive divisors of $n$. Thus a positive 
integer $n$ is perfect if $\sigma(n) = 2n$. 

(a) Find a formula for $\sigma(n)$ using the canonical prime decomposition 
$n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$. 

(b) Use the formula for $\sigma(n)$ from part (a) to prove that the function 
$\sigma(n)$ is multiplicative; that is, prove that, for any two positive 
integers $m$ and $n$ that are relatively prime, 

$$\sigma(mn) = \sigma(m)\sigma(n).$$

(c) Use the function $\sigma(n)$ to verify that 8128 is a perfect number. 

(d) Repeat the proof of Theorem 5.4 using the function $\sigma(n)$. 

7.32 (S) Show that no number of the form $4^n(8k + 7)$ can be a sum of three 

squares (see Problem 1.16). 

7.33 (S) Linnik proved, in 1942, that there are only finitely many numbers 
that cannot be written as a sum of seven cubes. Thus $G(3) \le 7$. It is 
believed that 454 is the largest number that requires eight cubes. Write 
454 as a sum of eight cubes. 

It is strongly suspected that 8042 is the largest number that requires 
seven cubes: $8042 = 16^3 + 12^3 + 10^3 + 10^3 + 6^3 + 1^3 + 1^3$, even though 
it is not yet known whether $G(3) \le 6$. The largest number known to 
require six cubes is 129 074, and it has been conjectured that in fact 
G(3) = 4, and that 7 373 170 279 850 is the largest number that cannot 
be written as a sum of four cubes. 

7.34 (S) Edward Waring's claim that all numbers can be represented with 

9 cubes, 19 fourth powers, 37 fifth powers, and so on, is not nearly as 
dramatic as it might first appear. The cubes are 1 , 8, 27, ... , and so it is 
clear that the number 7 requires 7 cubes, and similarly 15 will then 
require 8 cubes, and then 23 will require 9 cubes. But the next number 
in the arithmetic sequence 7, 15, 23, . . . , is 31, which requires 
not 10 but only 5 cubes, since 31 > 27, and we can write 
31 = 27 + 1 + 1 + 1 + 1. 
Use a similar argument, and similar arithmetic sequences in each 
case, to find the first number that requires 19 fourth powers, the first 
number that requires 37 fifth powers, and the first number that 
requires 73 sixth powers. 

Then show in general that 

$$g(k) \ge \left\lfloor \left(\tfrac{3}{2}\right)^k \right\rfloor + 2^k - 2,$$

by finding, for each $k$, an explicit integer $n$ that requires $\left\lfloor \left(\tfrac{3}{2}\right)^k \right\rfloor + 2^k - 2$ 
$k$th powers. 


7.35 (S) By experimenting with a few values of $k$ and $p$ in the following 
expression for the sum of all the $k$th powers modulo a prime $p$ 

$$1^k + 2^k + \cdots + (p - 1)^k$$

make a conjecture about the value of this sum modulo $p$. Then prove 
your conjecture. 

For example, for the prime $p = 3$ and $k = 2$, this sum is $1^2 + 2^2 = 5 \equiv 
2 \pmod{3}$; and for $k = 3$, this sum is $1^3 + 2^3 = 9 \equiv 0 \pmod{3}$. 

7.36 (H,S) In Problem 5.32, we asked which positive integers can be written as 

a difference of two squares. Prove that every positive integer can be 
written as a sum or difference of three squares. For example, 

$$60 = 7^2 + 6^2 - 5^2.$$


7.37 (H,S) Prove that every positive integer can be written as a sum or 

difference of five cubes. 

7.38 (H,S) Fermat’s last theorem says that the equation 

$$x^n + y^n = z^n$$

has no nontrivial solutions for $n > 2$, and this conjecture first came to 
Fermat, while reading Diophantus, as a natural generalization of the 
Pythagorean equation 

$$x^2 + y^2 = z^2.$$

There is one other quite natural way to generalize the Pythagorean 
equation: increase the number of terms to match the power as the 
power increases; that is, the general equation becomes 

$$x_1^n + x_2^n + \cdots + x_n^n = z^n,$$

for $n > 1$. 
We saw in Problem 1.18 that, in contrast to the equations of Fermat’s 
last theorem, for $n = 3$ this equation does have nontrivial solutions, 
and among these solutions, the three smallest are 

$$3^3 + 4^3 + 5^3 = 6^3, \quad 1^3 + 6^3 + 8^3 = 9^3, \quad \text{and} \quad 7^3 + 14^3 + 17^3 = 20^3.$$

In his same letter to Goldbach which dealt with Fermat’s last 
theorem, and which was previously quoted, Euler wrote 

But since the equation $aa + bb = cc$ is possible, and so also is 
possible, $a^3 + b^3 + c^3 = d^3$, it seems to follow that this, 
$a^4 + b^4 + c^4 + d^4 = e^4$, is possible, but up till now I have been able 
to find no case of it. 

The smallest solution to this equation, for $n = 4$, was discovered in 
1911, 

$$30^4 + 120^4 + 272^4 + 315^4 = 353^4,$$

and the smallest, for $n = 5$, is 

$$19^5 + 43^5 + 46^5 + 47^5 + 67^5 = 72^5.$$

This is the highest power for which a solution is known! Verify each of 
these solutions. 

In 1769, Euler made a conjecture that was every bit as bold as 
Fermat’s famous conjecture about his last theorem. Euler claimed that 
no nth power could ever be written as a sum of fewer than n other nth 
powers. In other words, a cube would always require three cubes, a 
fourth power would require four fourth powers, and so on. 

This conjecture lasted for almost two hundred years, but in 1966,
Lander and Parkin discovered a counterexample,

\[ 27^5 + 84^5 + 110^5 + 133^5 = 144^5 . \]

Verify this counterexample to Euler’s conjecture. 

Then, in 1987, N. Elkies found infinitely many counterexamples to 
Euler’s conjecture for n = 4, including 

\[ 95\,800^4 + 217\,519^4 + 414\,560^4 = 422\,481^4 . \]

Verify this counterexample.

See the hint section if your calculator will not compute integers as
large as $422\,481^4$.
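
A calculator is not the only way to carry out these verifications. The following is a minimal sketch in Python, whose integers have no size limit; the table `CLAIMS` and the helper `power_sum_holds` are illustrative names chosen here, not part of the text.

```python
# Each claim is (bases on the left-hand side, base on the right-hand side, exponent).
CLAIMS = [
    ([3, 4, 5], 6, 3),
    ([1, 6, 8], 9, 3),
    ([7, 14, 17], 20, 3),
    ([30, 120, 272, 315], 353, 4),
    ([19, 43, 46, 47, 67], 72, 5),
    ([27, 84, 110, 133], 144, 5),          # Lander and Parkin
    ([95800, 217519, 414560], 422481, 4),  # Elkies
]


def power_sum_holds(bases, total, exponent):
    """Return True when the sum of b**exponent over bases equals total**exponent."""
    return sum(b ** exponent for b in bases) == total ** exponent


for bases, total, exponent in CLAIMS:
    print(bases, total, exponent, power_sum_holds(bases, total, exponent))
```

Because the arithmetic is exact, a result of `True` is a genuine verification rather than a floating-point approximation.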
7.39 (S) There are two instances where identities are known which provide
infinitely many solutions to the equation

\[ x_1^n + x_2^n + \cdots + x_n^n = z^n \]

discussed in Problem 7.38. 

First, verify the identity 

\[ (9x^4)^3 + (3xy^3 - 9x^4)^3 + (y^4 - 9x^3 y)^3 = (y^4)^3 . \]

Next, show that there are infinitely many choices for x and y that make 
each term in this identity positive. 

Then, for several values of x and y use this identity to produce 
solutions to the equation 


\[ x_1^3 + x_2^3 + x_3^3 = z^3 . \]

Be sure to verify your solutions. 

There is also an identity that works for fifth powers, 

\[ (75y^5 - x^5)^5 + (x^5 + 25y^5)^5 + (x^5 - 25y^5)^5 + (10x^3y^2)^5 + (50xy^4)^5 = (x^5 + 75y^5)^5 . \]


Verify this identity, and show that there are infinitely many choices of 
x and y that make each term in this identity positive. 

Next, find the two smallest solutions to the equation 

\[ x_1^5 + x_2^5 + x_3^5 + x_4^5 + x_5^5 = z^5 \]

produced by this identity, and then actually verify the smaller of these 
two solutions. 

Finally, find the smallest value of y for which there is more than one 
value of x that yields a solution to this equation. 

No analogous identity is known for fourth powers, which, at least in 
part, explains why there are only sporadic solutions known for the 
equation 


\[ x_1^4 + x_2^4 + x_3^4 + x_4^4 = z^4 . \]
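
Before attempting an algebraic verification of the cubic and fifth-power identities in this problem, it can be reassuring to spot-check them numerically. The sketch below (in Python, with function names invented for this illustration) tests both identities on a small grid of integers. A numerical check of this kind is evidence, not a proof.

```python
def cubic_identity_holds(x, y):
    """Check (9x^4)^3 + (3xy^3 - 9x^4)^3 + (y^4 - 9x^3 y)^3 = (y^4)^3 for integers x, y."""
    lhs = (9 * x**4) ** 3 + (3 * x * y**3 - 9 * x**4) ** 3 + (y**4 - 9 * x**3 * y) ** 3
    return lhs == (y**4) ** 3


def quintic_identity_holds(x, y):
    """Check the fifth-power identity with right-hand side (x^5 + 75y^5)^5."""
    lhs = (
        (75 * y**5 - x**5) ** 5
        + (x**5 + 25 * y**5) ** 5
        + (x**5 - 25 * y**5) ** 5
        + (10 * x**3 * y**2) ** 5
        + (50 * x * y**4) ** 5
    )
    return lhs == (x**5 + 75 * y**5) ** 5


grid = range(-10, 11)
print(all(cubic_identity_holds(x, y) for x in grid for y in grid))
print(all(quintic_identity_holds(x, y) for x in grid for y in grid))
```

The same functions can be reused to hunt for values of x and y that make every term positive, which is the part of the problem a numerical check cannot settle on its own.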


7.40 (H,S) Show that

\[ \left( \frac{-1 + i\sqrt{3}}{2} \right)^3 = 1 ; \]

that is, verify that $\rho = \frac{-1 + i\sqrt{3}}{2}$ is a cube root of 1.

Next, verify that

\[ \rho^2 = \frac{-1 - i\sqrt{3}}{2} , \]

and then show that $\rho^2$ is also a cube root of 1; that is, show that
$(\rho^2)^3 = 1$. Thus, since 1 is obviously a cube root of 1 (since $1^3 = 1$), the
three “cube roots of unity” are 1, $\rho$, and $\rho^2$.

Show that

\[ 1 + \rho + \rho^2 = 0 . \]

This same thing is true in general for any $n > 1$. There are $n$ $n$th roots
of unity; that is, there are $n$ complex numbers $z$ such that $z^n = 1$.
Moreover, these $n$ $n$th roots of unity can be generated by one of the
roots of unity, so we can write the $n$ $n$th roots of unity as

\[ 1, \sigma, \sigma^2, \ldots, \sigma^{n-1} . \]

These $n$ $n$th roots of unity also satisfy the equation

\[ 1 + \sigma + \sigma^2 + \cdots + \sigma^{n-1} = 0 . \]

Verify this for the case $n = 4$ by finding a complex number $\sigma$ that has
the property that each of the four numbers $1, \sigma, \sigma^2, \sigma^3$ is a fourth root
of 1; then also show that $1 + \sigma + \sigma^2 + \sigma^3 = 0$.

7.41 Verify that if $\rho = \frac{-1 + i\sqrt{3}}{2}$, then the equation

\[ x^3 + y^3 = z^3 \]

can be rewritten as

\[ (x + y)(x + \rho y)(x + \rho^2 y) = z^3 . \]

(So, here again, $\rho$ is a cube root of 1, that is, $\rho^3 = 1$.)

7.42 (H,S) One of several famous formulas that go by the name Euler’s
formula is

\[ e^{iz} = \cos z + i \sin z , \]

a formula that provides a highly unexpected relationship between the
exponential function and the trigonometric functions. This formula
emerges quite easily, and quite miraculously, from the power series
expansions of these three functions.

In particular, by letting $z = \pi$, this formula becomes

\[ e^{i\pi} = -1 , \]

which can be rewritten as

\[ e^{i\pi} + 1 = 0 , \]

and thus includes what are sometimes called the five most important
numbers in mathematics, and for this reason, the stunningly beautiful
formula $e^{i\pi} + 1 = 0$ is sometimes called Euler’s formula.

Euler’s formula (that is to say, the first, more general formula
above) gives us a good way to think about roots of unity. For example,
for the cube root of unity $\rho$ in Problems 7.40 and 7.41, we can write
$\rho = e^{2\pi i/3}$, and it should immediately be clear from Euler’s formula why
$\rho^3 = 1$, and why $\rho^2$ is also a cube root of unity.

Use Euler’s formula to find a sixth root of unity, other than $\pm 1$, in
the form $\sigma = a + bi$, and verify directly using this form that $\sigma^6 = 1$.
Then find all six sixth roots of unity, $1, \sigma, \sigma^2, \sigma^3, \sigma^4$, and $\sigma^5$, and verify
directly that

\[ 1 + \sigma + \sigma^2 + \sigma^3 + \sigma^4 + \sigma^5 = 0 . \]

Finally, explain how you can know without any direct calculation at 
all that this sum has to be 0. 
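
As a numerical companion to Problems 7.40 and 7.42, the sketch below builds the n nth roots of unity from Euler's formula and checks, up to floating-point rounding, that each one satisfies z^n = 1 and that they sum to 0. It assumes the generator e^(2πi/n); the function name `roots_of_unity` is chosen here for illustration. Since the arithmetic is approximate, the printed errors are tiny but not exactly zero.

```python
import cmath


def roots_of_unity(n):
    """Return 1, s, s**2, ..., s**(n-1) for the generator s = e^(2*pi*i/n)."""
    s = cmath.exp(2j * cmath.pi / n)
    return [s ** k for k in range(n)]


for n in (3, 4, 6):
    roots = roots_of_unity(n)
    worst_power_error = max(abs(z ** n - 1) for z in roots)
    # Columns: n, |sum of the roots|, largest |z^n - 1| over the roots.
    print(n, abs(sum(roots)), worst_power_error)
```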

7.43 (H,S) Another famous formula that is also called Euler’s formula is one he
published in 1752 that relates the number of vertices ($v$), edges ($e$), and
faces ($f$) of polyhedra:

\[ v - e + f = 2 . \]

For example, a cube has eight vertices, twelve edges, and six faces, and 
we see that 8 - 12 + 6 = 2. 

Euler thought he had proved this formula, but his procedure does not 
work in all cases. Legendre eventually proved Euler’s formula in 1794. 
We can easily prove Euler’s formula using the language of graphs. The 
five regular polyhedra (that is, polyhedra whose faces are all identical 
regular polygons) are drawn in Figure 7.3 as graphs in the plane. Note 
that the graph of the cube still has eight vertices and twelve edges, but
the six faces of the cube are now the six regions in the graph (one region
is the outside infinite region).

[Figure 7.3]

For the graph of a polyhedron, therefore, we still have vertices ($n$)
and edges ($m$), but instead of faces we now have regions ($r$). So, Euler’s
formula for connected graphs drawn in the plane becomes

\[ n - m + r = 2 . \]

This formula is almost trivial to prove by induction since if you 
start with a single vertex, and no edge (the smallest possible graph), 
then you have one infinite region, so 1 - 0 + 1 = 2. Then, as you 
draw any connected graph starting with a single vertex, at any stage 
in the drawing process you always either add an edge connecting to a 
new vertex, or add an edge connecting two existing vertices (thereby 
splitting an existing region into two new regions); in either case, you do
not change the value of $n - m + r$, since in one case you add an edge and
also add a vertex, while in the other case you add an edge and also add a
region. Thus when you have finished drawing the graph, $n - m + r$ still
equals 2.

It has been known since Euclid that there are exactly five regular
polyhedra, and because Plato wrote about these five polyhedra extensively
in his dialogue Timaeus, these five polyhedra are now known as
the five Platonic solids: the tetrahedron, the cube, the octahedron, the
dodecahedron, and the icosahedron, having, respectively, 4, 6, 8, 12, and
20 faces.

In fact, we can use Euler’s formula to give a very simple proof of the 
remarkable fact that there are only five of these regular polyhedra. More 
significant for us in the context of studying number theory, however, 
is the fact that the proof we are about to give is fundamentally a 
Diophantine argument. 

Now, suppose we have a regular polyhedron with $d$ edges coming
together at each vertex. Since this is also true in the graph of the
polyhedron, we see that $m = \frac{dn}{2}$ (each edge joins two vertices). Suppose
also that each regular polygonal face of this polyhedron has $s$ sides;
then in the graph each region has $s$ sides, and we see that $m = \frac{sr}{2}$ (each
edge borders two regions).

Use these two facts and Euler’s formula to conclude that

\[ \frac{1}{d} + \frac{1}{s} - \frac{1}{2} = \frac{1}{m} . \]

Then show that this Diophantine equation has exactly five solutions, 
each solution corresponding to one of the five Platonic solids. 
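
A brute-force search can confirm the count of solutions before you prove it. The sketch below is one way to do this in Python with exact fractions. It assumes d ≥ 3 and s ≥ 3 (a polyhedron needs at least three edges at a vertex and at least three sides on a face) and searches only up to an arbitrary bound, `limit`; the function name is invented for this illustration.

```python
from fractions import Fraction


def platonic_solutions(limit=50):
    """List (d, s, m, n, r) with 1/d + 1/s - 1/2 = 1/m for 3 <= d, s <= limit."""
    found = []
    for d in range(3, limit + 1):
        for s in range(3, limit + 1):
            gap = Fraction(1, d) + Fraction(1, s) - Fraction(1, 2)
            # 1/m must be positive with numerator 1 for m to be a positive integer.
            if gap > 0 and gap.numerator == 1:
                m = gap.denominator   # number of edges
                n = 2 * m // d        # number of vertices, from m = dn/2
                r = 2 * m // s        # number of regions (faces), from m = sr/2
                found.append((d, s, m, n, r))
    return found


for d, s, m, n, r in platonic_solutions():
    print(f"d={d} s={s} edges={m} vertices={n} faces={r}")
```

The search finds nothing new for larger values of `limit`, which is what the Diophantine argument in the problem explains: once d and s are both at least 4, or one of them is at least 6, the left-hand side is no longer positive.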
The Young Gauss

The nineteenth-century philosopher Arthur Schopenhauer said 

talent hits a target no one else can hit; genius hits a target no one 
else can see. 

Many great mathematicians (Euclid, Fermat, Euler, Lagrange) certainly
had more than their fair share of talent, and each made enormous
and lasting contributions to mathematics, but Carl Friedrich Gauss was
in a class by himself, and his genius showed itself at a very young age.

What Gauss accomplished while he was still a teenager is astonishing.
The ancient Greeks had been able to construct, using
straightedge and compass, regular polygons with $n$ sides where $n =
3, 4, 5, 6, 8, 10, 12, 15$, and 16. Gauss, at the age of nineteen, discovered
that a regular seventeen-sided polygon can also be constructed
using a straightedge and compass, and in general explained the strange
gaps at $n = 7, 9, 13$, and 14 by proving that a regular polygon with $n$
sides can be constructed with a straightedge and compass if and only if
in the canonical prime factorization of $n$, where

\[ n = 2^m p_1^{a_1} \cdots p_k^{a_k} , \]

each prime $p_i$ is a Fermat prime, and each $a_i = 1$.

During this same year, 1796, he wrote in the diary he was to keep for 
the next eighteen years, 

ΕΥΡΗΚΑ! num = Δ + Δ + Δ,

that is: eureka, every number is the sum of three triangular numbers.
This, in turn, is equivalent to the statement that every number of the
form $8m + 3$ is a sum of three odd squares (see Problem 8.1).
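
Both statements are easy to test for small numbers. The following Python sketch checks them by exhaustive search; it assumes that 0 counts as a triangular number, and the function names are chosen here for illustration.

```python
def triangular_numbers_up_to(limit):
    """Triangular numbers k(k+1)/2 that do not exceed limit, starting from 0."""
    values, k = [], 0
    while k * (k + 1) // 2 <= limit:
        values.append(k * (k + 1) // 2)
        k += 1
    return values


def is_sum_of_three_triangular(n):
    tri = triangular_numbers_up_to(n)
    tri_set = set(tri)
    return any(n - a - b in tri_set for a in tri for b in tri if a + b <= n)


def is_sum_of_three_odd_squares(n):
    odd_squares = [k * k for k in range(1, int(n ** 0.5) + 2, 2) if k * k <= n]
    square_set = set(odd_squares)
    return any(n - a - b in square_set for a in odd_squares for b in odd_squares if a + b < n)


print(all(is_sum_of_three_triangular(m) for m in range(300)))
print(all(is_sum_of_three_odd_squares(8 * m + 3) for m in range(300)))
```

The link between the two statements is the algebraic fact that if m is the sum of the triangular numbers with indices a, b, c, then 8m + 3 = (2a+1)^2 + (2b+1)^2 + (2c+1)^2.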

One target that Gauss could see, perhaps as early as when he was 
fourteen years old, is an extraordinary result that we now call the prime 
number theorem. He conjectured that for a number $n$, the number of
primes from 1 to $n$ is approximately $\frac{n}{\ln n}$, where $\ln n$ is the natural
logarithm of $n$. This celebrated theorem was eventually proved in 1896,
independently, by Hadamard and de la Vallée Poussin.
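
To get a feel for how good this approximation is, one can simply count primes. The sketch below uses a sieve of Eratosthenes (a choice made here for convenience, not a method attributed to Gauss) and compares the true count with n / ln n; the function name is illustrative.

```python
import math


def prime_count_table(limit):
    """Return a list whose entry n is the number of primes <= n, for 0 <= n <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = False
    if limit >= 1:
        is_prime[1] = False
    for i in range(2, math.isqrt(limit) + 1):
        if is_prime[i]:
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    counts, running = [], 0
    for flag in is_prime:
        running += flag
        counts.append(running)
    return counts


counts = prime_count_table(100_000)
for n in (100, 1_000, 10_000, 100_000):
    estimate = n / math.log(n)
    # Columns: n, true prime count, estimate n/ln n, ratio of the two.
    print(n, counts[n], round(estimate, 1), round(counts[n] / estimate, 3))
```

The ratio in the last column drifts slowly toward 1, which is what the prime number theorem asserts in the limit.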
Figure 8.1 Carl Friedrich Gauss,
1777-1855.

But the crowning achievement of the youthful Gauss that we will 
concentrate on in this chapter is quadratic reciprocity. Euler had been 
aware of the essential nature of what we now call the law of quadratic 
reciprocity based on extensive calculations, but any kind of general proof 
eluded him. Legendre could almost prove it, but not quite. The first 
proof of this result was given by Gauss when he was nineteen. 

He included this proof in his great book on number theory, Disquisitiones
Arithmeticae, which appeared in 1801, when Gauss was twenty-four,
and which also included Theorem 3.4, the fundamental theorem of
arithmetic. Only two years earlier, he had completed his doctoral thesis
on the fundamental theorem of algebra. By the end of the year Gauss
would be famous throughout Europe for having correctly predicted
where to find the recently discovered asteroid Ceres, which lies in the
mysterious gap between the orbits of Mars and Jupiter and which had been
lost after only a few sightings before it passed from view behind the sun.
Gauss was able to accomplish this feat because, when he was seventeen, 
he had devised a new method for fitting a curve to a limited set of 
data points — we now call this the method of least squares, because this 
statistical technique finds the “curve of best fit” by minimizing the 
sum of the squares of the distances from the points in a given set to a 
curve. 

During his life Gauss would give seven different proofs of the law 
of quadratic reciprocity, just as he would give four different proofs 
for the fundamental theorem of algebra. His first proof of the law of
quadratic reciprocity was quite difficult, and there is little doubt that
Gauss felt the need to look for the “real” proof, that is, in the words of
the twentieth-century mathematician Paul Erdős, to look for the proof
from the Book (see page 36).

By 1808, Gauss had discovered a property of quadratic residues that 
no one else had seen. This property, which we now call Gauss’s lemma, 
provided him with a proof that was far simpler than his original proof of 
one of the most important theorems in number theory. We begin with 
a discussion of quadratic residues, but our goal in this chapter is a proof 
of this great theorem, the law of quadratic reciprocity. 


Quadratic Residues 

Euler gradually became aware of quadratic reciprocity because, from
the very beginning, he was trying to reproduce Fermat’s work on theorems
such as Theorem 5.1, which tells us which odd primes $p$ can be
written as

\[ p = x^2 + y^2 . \]

Fermat also had theorems, but left no proofs, for which odd primes
have the form

\[ p = x^2 + 2y^2 \]

and which odd primes have the form

\[ p = x^2 + 3y^2 \]

(see Problem 8.7). Thus, inevitably, Euler became interested in the
general question of which odd primes have the form

\[ p = x^2 + ny^2 \]

for a given positive integer $n$.

We saw that the key step in the proof of Theorem 5.1 was showing
that when $p$ is of the form $4n + 1$, then there are integers $x$ and $y$
such that $p \mid (x^2 + y^2)$; and this step depended on knowing that, under
this hypothesis, $-1$ is a quadratic residue modulo $p$. The term quadratic
reciprocity originally came into use to signal the fact that, for example,
the question of whether $p \mid (x^2 + 1 \cdot y^2)$ depends in a reciprocal way on the
question of whether the quadratic congruence

\[ x^2 \equiv -1 \pmod{p} \]

has a solution. The meaning of the word reciprocity will change for us 
once we actually see the law of quadratic reciprocity (Theorem 8.6). 

We introduced quadratic residues in Chapter 7 in order to be able to 
prove Theorem 7.3, which we needed at the time as a steppingstone for 
Lagrange’s four squares theorem. Let’s recall the definition and the basic 
properties. 

For a prime p, a number a that is relatively prime to p is a quadratic 
residue of p if the congruence 

\[ x^2 \equiv a \pmod{p} \]

has a solution; otherwise, a is a quadratic nonresidue. So quadratic 
residues are just the nonzero “squares” modulo p. 

Moreover, note that if a is a quadratic residue for an odd prime 
p, and if b is one solution to this congruence, then -b is another 
solution. Hence, by Lagrange’s theorem, Theorem 6.5, this congruence 
has exactly two solutions, and so half the numbers from 1 to p - 1 are 
quadratic residues and half are quadratic nonresidues. Note also that, 
in this definition, we are specifically excluding 0 from being either a 
quadratic residue or a quadratic nonresidue. 

An extremely useful way to think of quadratic residues is in terms of
primitive roots. Since any prime $p$ has a primitive root $r$ by Theorem
7.2, and the numbers $1, 2, \ldots, p - 1$ are congruent to the numbers
$r, r^2, \ldots, r^{p-1}$, in some order, it is clear that the quadratic residues
of $p$ (that is, the “squares”) correspond to the even powers of $r$:
$r^2, r^4, r^6, \ldots, r^{p-1}$; and the quadratic nonresidues correspond to the
odd powers: $r, r^3, r^5, \ldots, r^{p-2}$. This is another way of explaining why
half the numbers from 1 to $p - 1$ are quadratic residues and half are
quadratic nonresidues.

This also explains the following two properties of quadratic residues: 

(i) the product of two quadratic residues, or of two quadratic 
nonresidues, is a quadratic residue; 

(ii) the product of a quadratic residue and a quadratic nonresidue is 
a quadratic nonresidue; 

because (i) the sum of two even exponents is even — or the sum of two 
odd exponents is even; and (ii) the sum of an even and an odd exponent 
is odd. 
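
The correspondence between quadratic residues and even powers of a primitive root can be seen concretely for a small prime. The sketch below, in Python, finds a primitive root by brute force (fine for small primes, far too slow for large ones) and compares the even powers with the set of nonzero squares; the function names are illustrative.

```python
def primitive_root(p):
    """Smallest r whose powers r, r^2, ..., r^(p-1) give every nonzero residue mod p."""
    for r in range(1, p):
        if len({pow(r, k, p) for k in range(1, p)}) == p - 1:
            return r
    raise ValueError("no primitive root found; is p prime?")


def squares_mod(p):
    """The nonzero squares modulo p."""
    return {x * x % p for x in range(1, p)}


p = 13
r = primitive_root(p)
even_powers = {pow(r, k, p) for k in range(2, p, 2)}   # r^2, r^4, ..., r^(p-1)
odd_powers = {pow(r, k, p) for k in range(1, p, 2)}    # r, r^3, ..., r^(p-2)
print(r, sorted(even_powers), sorted(odd_powers))
print(even_powers == squares_mod(p))
```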




The Legendre Symbol 

Adrien-Marie Legendre was among a small group of very influential
mathematicians whose careers spanned the French Revolution.
Legendre published his most successful text Éléments de géométrie in
1794, though his primary mathematical contributions were in analysis, 
applied mathematics and, of course, as we have already seen, in number 
theory. 

In 1798, Legendre introduced an extremely convenient symbol for 
working with quadratic residues. Naturally, we now call this symbol the 
Legendre symbol. For a number $a$ not congruent to 0 modulo a prime $p$,
we define the Legendre symbol $\left(\frac{a}{p}\right)$ to be

\[ \left(\frac{a}{p}\right) = \begin{cases} 1, & \text{if $a$ is a quadratic residue of $p$;} \\ -1, & \text{if $a$ is a quadratic nonresidue of $p$.} \end{cases} \]

Note that if $r$ is a primitive root of $p$, and $a \equiv r^k \pmod{p}$, this is
equivalent to saying $\left(\frac{a}{p}\right) = (-1)^k$, because when $k$ is even, $(-1)^k = 1$,
and when $k$ is odd, $(-1)^k = -1$.

The Legendre symbol does need to be used with some care, however, 
because it is at first sometimes easy to mistake it for a fraction, which 
it is not. It is merely a useful symbol for recording whether a number 
a is, or is not, a quadratic residue modulo a prime p. Fortunately, the 
context usually makes clear whether an expression is a Legendre symbol 
or a fraction. For example, we could use the Legendre symbol in the 
following way to show that 6 is a quadratic residue modulo 19:

\[ \left(\frac{6}{19}\right) = \left(\frac{25}{19}\right) = 1 , \]

and neither equality would make any sense at all for fractions. Note,
also, how this expression is an extremely efficient way of saying
that 6 is a quadratic residue modulo 19 because 6 is congruent to 25
modulo 19 and 25 is a square.

Properties (i) and (ii) of quadratic residues above can be restated as
the following multiplicative property for the Legendre symbol:

\[ \left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right) . \]

This is because if $a$ and $b$ are two quadratic residues, then $ab$ is a
quadratic residue, and the equation just states that $1 = 1 \cdot 1$; or, if $a$
and $b$ are two quadratic nonresidues, then $ab$ is a quadratic residue, and
the equation is just $1 = (-1) \cdot (-1)$; and, if one of $a$ and $b$ is a quadratic
residue, and the other is a quadratic nonresidue, then $ab$ is a quadratic
nonresidue, and the equation is either $-1 = (1) \cdot (-1)$ or $-1 = (-1) \cdot (1)$.

It is this multiplicative property that makes the Legendre symbol so
useful for computation. Here is an illustration of the way in which this
property will be used: a straightforward computation (since we already
know that 6 is a quadratic residue modulo 19) showing that 5 is also a
quadratic residue modulo 19:

\[ \left(\frac{5}{19}\right) = \left(\frac{24}{19}\right) = \left(\frac{4}{19}\right)\left(\frac{6}{19}\right) = 1 \cdot 1 = 1 . \]

In other words, what the Legendre symbol, and especially its 
multiplicative property, allows us to do is represent tedious arguments 
such as 

since 6 is a quadratic residue modulo 19, and $4 = 2^2$ is also a
quadratic residue modulo 19, then their product 24 is also a 
quadratic residue modulo 19; but 24 is congruent to 5 modulo 19, 
so we conclude that 5 must be a quadratic residue modulo 19 

in a simple and easily computational manner. 
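
The claims about 5 and 6 modulo 19, and the multiplicative property itself, can be checked directly from the definition. The sketch below computes the Legendre symbol by listing all nonzero squares modulo p, which is only practical for small primes; the function name `legendre_by_search` is invented for this illustration.

```python
def legendre_by_search(a, p):
    """Legendre symbol (a/p) for an odd prime p and a not divisible by p,
    found by listing every nonzero square modulo p."""
    if a % p == 0:
        raise ValueError("a must not be divisible by p")
    squares = {x * x % p for x in range(1, p)}
    return 1 if a % p in squares else -1


p = 19
print(legendre_by_search(6, p), legendre_by_search(25, p))
print(legendre_by_search(5, p), legendre_by_search(4, p) * legendre_by_search(6, p))

# Multiplicative property (ab/p) = (a/p)(b/p) for every pair of nonzero residues.
print(all(
    legendre_by_search(a * b, p) == legendre_by_search(a, p) * legendre_by_search(b, p)
    for a in range(1, p) for b in range(1, p)
))
```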


Euler's Criterion 

How can we tell whether a number $a$ is a quadratic residue modulo a
given prime $p$? For small primes, we can compute $1^2, 2^2, \ldots, \left(\frac{p-1}{2}\right)^2$,
and see if $a$ shows up. For large primes, this method isn’t very practical.

In 1755, Euler discovered a criterion that, at least in principle, gives 
us a good way to tell whether a number a is a quadratic residue for a 
prime $p$, or not. Take, for example, the prime $p = 13$; then, the quadratic
residues are 1, 4, 9, $4^2 \equiv 3$, $5^2 \equiv 12$, and $6^2 \equiv 10 \pmod{13}$; that is, the
quadratic residues are 1, 3, 4, 9, 10, 12, and the quadratic nonresidues
are 2, 5, 6, 7, 8, 11.

Now let’s raise each of these numbers to the power $\frac{p-1}{2}$ and see what
happens. By Fermat’s little theorem, Theorem 5.2,

\[ \left( a^{\frac{p-1}{2}} \right)^2 = a^{p-1} \equiv 1 \pmod{p} , \]

for any $a$ relatively prime to $p$, so we know that each of these numbers
$a^{\frac{p-1}{2}}$ will be congruent to either 1 or $-1$.

For the first three quadratic residues, we get

\[ 1^6 = 1 , \quad 3^6 = 729 \equiv 1 , \quad 4^6 = 4096 \equiv 1 \pmod{13} , \]

and then, it immediately follows that for the last three quadratic
residues we also get

\[ 12^6 \equiv 10^6 \equiv 9^6 \equiv 1 \pmod{13} , \]

since $12 \equiv -1$, $10 \equiv -3$, and $9 \equiv -4 \pmod{13}$.

For the first three quadratic nonresidues, we get

\[ 2^6 = 64 \equiv -1 , \quad 5^6 = 15\,625 \equiv -1 , \quad 6^6 = 46\,656 \equiv -1 \pmod{13} , \]

so we also have

\[ 11^6 \equiv 8^6 \equiv 7^6 \equiv -1 \pmod{13} , \]

for the next three, since $11 \equiv -2$, $8 \equiv -5$, and $7 \equiv -6 \pmod{13}$.

Thus, as this example so clearly suggests, what Euler discovered is
that we can tell if a number $a$ is a quadratic residue by computing $a^{\frac{p-1}{2}}$
(mod $p$). If we get $+1$, then $a$ is a quadratic residue; if we get $-1$, then $a$
is a quadratic nonresidue. This test to determine whether a number $a$ is
a quadratic residue is called Euler’s criterion.

Theorem 8.1 (Euler's criterion). Let $p$ be an odd prime, and let $a$ be
relatively prime to $p$. Then

\[ \left(\frac{a}{p}\right) \equiv a^{\frac{p-1}{2}} \pmod{p} . \]

Proof 

First, we note, by Fermat's little theorem, Theorem 5.2, that
$\left( a^{\frac{p-1}{2}} \right)^2 = a^{p-1} \equiv 1 \pmod{p}$, so $a^{\frac{p-1}{2}}$ is a solution to the
congruence $x^2 \equiv 1 \pmod{p}$. Thus, $a^{\frac{p-1}{2}} \equiv 1$, or $a^{\frac{p-1}{2}} \equiv -1 \pmod{p}$.

Next, let $r$ be a primitive root of $p$, and write $a \equiv r^k \pmod{p}$. If $a$ is a
quadratic residue of $p$, then $k$ is even, so

\[ a^{\frac{p-1}{2}} \equiv (r^k)^{\frac{p-1}{2}} = \left( r^{\frac{k}{2}} \right)^{p-1} \equiv 1 \pmod{p} , \]

again, by Fermat’s little theorem.

On the other hand, if $a$ is a quadratic nonresidue of $p$, then $k$ is odd,
and $a^{\frac{p-1}{2}} \equiv -1 \pmod{p}$, since if $a^{\frac{p-1}{2}} \equiv 1 \pmod{p}$, we would have

\[ 1 \equiv a^{\frac{p-1}{2}} \equiv (r^k)^{\frac{p-1}{2}} = r^{k \cdot \frac{p-1}{2}} \pmod{p} , \]

so $(p - 1) \mid k \cdot \frac{p-1}{2}$ (since $r$ is a primitive root of $p$), which is impossible since
$k$ is odd. This completes the proof. ■
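
Euler's criterion translates almost word for word into code. The sketch below uses Python's built-in three-argument `pow`, which performs modular exponentiation efficiently; it assumes p is an odd prime and does not check this. The function name `euler_criterion` is chosen here for illustration.

```python
def euler_criterion(a, p):
    """Return (a/p) as 1 or -1 by computing a^((p-1)/2) mod p.

    Assumes p is an odd prime. Returns 0 when p divides a, a case the
    Legendre symbol as defined in the text does not cover.
    """
    value = pow(a, (p - 1) // 2, p)
    return -1 if value == p - 1 else value


p = 13
residues = [a for a in range(1, p) if euler_criterion(a, p) == 1]
nonresidues = [a for a in range(1, p) if euler_criterion(a, p) == -1]
print(residues)
print(nonresidues)
```

Because modular exponentiation takes time proportional to the number of digits of p, this test remains practical for primes far too large for a list of squares.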


Gauss's Lemma 

In 1808, Gauss discovered another very effective criterion to tell 
whether a number is a quadratic residue for a prime p. This criterion, 
now called Gauss's lemma, enabled him to give a far less complicated 
proof of the law of quadratic reciprocity than the one he had been able 
to give when he was eighteen. 

Here is how Gauss’s lemma works. For a number $a$ relatively prime to
$p$, reduce the numbers in the list

\[ a, \; 2a, \; 3a, \; \ldots, \; \tfrac{p-1}{2} a \]

to their residues from $-\frac{p-1}{2}$ to $\frac{p-1}{2}$.

For example, if $p = 13$, then, for $a = 3$, the list

\[ 3, \; 2 \cdot 3, \; 3 \cdot 3, \; 4 \cdot 3, \; 5 \cdot 3, \; 6 \cdot 3 \]

becomes

\[ 3, \; 6, \; -4, \; -1, \; 2, \; 5 . \]

Note that each of the six integers 1, 2, 3, 4, 5, 6 from 1 to $\frac{p-1}{2}$ occurs
exactly once in this list with either a plus or a minus sign. Gauss then
tells us to count the number of minus signs; in this case there are two,
and so, letting $n = 2$, we do the following computation (since $a = 3$):

\[ \left(\frac{3}{13}\right) = (-1)^n = (-1)^2 = 1 , \]

and we get 1, as we should, since 3 is a quadratic residue of 13.

Let’s try it for $a = 5$, a quadratic nonresidue of 13. The list

\[ 5, \; 2 \cdot 5, \; 3 \cdot 5, \; 4 \cdot 5, \; 5 \cdot 5, \; 6 \cdot 5 \]

becomes

\[ 5, \; -3, \; 2, \; -6, \; -1, \; 4 . \]

Again, each of the six integers from 1 to 6 occurs exactly once with
either a plus or a minus sign. This time there are three minus signs, so
we let $n = 3$, and

\[ \left(\frac{5}{13}\right) = (-1)^n = (-1)^3 = -1 , \]

exactly as expected, since 5 is a quadratic nonresidue.


Theorem 8.2 (Gauss's lemma). Let $p$ be an odd prime, and let $a$ be
relatively prime to $p$. Let $n$ be the number of numbers in the list

\[ a, \; 2a, \; 3a, \; \ldots, \; \tfrac{p-1}{2} a \]

whose residues from $-\frac{p-1}{2}$ to $\frac{p-1}{2}$ are negative. Then

\[ \left(\frac{a}{p}\right) = (-1)^n . \]



Proof 

We can argue, as we did in Chapter 6 in the proof of Fermat’s
little theorem, Theorem 5.2, that the numbers $a, 2a, 3a, \ldots, \frac{p-1}{2} a$ are
incongruent modulo $p$, but in fact these numbers have the stronger
property that we pointed out in the examples above: namely, it is also
impossible for two of these numbers to have the “same” residue with
opposite signs, for if $ra \equiv -sa \pmod{p}$, then $(r + s)a \equiv 0 \pmod{p}$,
which can’t be true, since $0 < r + s < p$.

Thus, $a, 2a, 3a, \ldots, \frac{p-1}{2} a$ represent, in some order, each of the $\frac{p-1}{2}$
integers $1, 2, 3, \ldots, \frac{p-1}{2}$ with either a plus sign or a minus sign. So, we
can write

\[ a (2a)(3a) \cdots \left( \tfrac{p-1}{2} a \right) \equiv (\pm 1)(\pm 2)(\pm 3) \cdots \left( \pm \tfrac{p-1}{2} \right) \pmod{p} . \]

Therefore, dividing by $1, 2, 3, \ldots, \frac{p-1}{2}$, we get

\[ a^{\frac{p-1}{2}} \equiv (-1)^n \pmod{p} , \]

where $n$ is the number of minus signs.
Thus, by Euler’s criterion, Theorem 8.1, and because both sides equal
$\pm 1$ and $p$ is odd,

\[ \left(\frac{a}{p}\right) = (-1)^n . \]

This completes the proof. 
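
The recipe in Gauss's lemma is short enough to write out as code. The sketch below, in Python, reduces each multiple of a to its residue between -(p-1)/2 and (p-1)/2, counts the negative ones, and returns both the reduced list and the resulting sign. It assumes p is an odd prime not dividing a; the function names are illustrative.

```python
def signed_residue(x, p):
    """Residue of x modulo p chosen in the range -(p-1)/2 .. (p-1)/2."""
    r = x % p
    return r - p if r > (p - 1) // 2 else r


def gauss_lemma(a, p):
    """Return (reduced list, (-1)**n), where n counts the negative residues
    among a, 2a, ..., ((p-1)/2)a."""
    half = (p - 1) // 2
    reduced = [signed_residue(k * a, p) for k in range(1, half + 1)]
    n = sum(1 for value in reduced if value < 0)
    return reduced, (-1) ** n


print(gauss_lemma(3, 13))
print(gauss_lemma(5, 13))
```

The two calls reproduce the worked examples for p = 13: two minus signs for a = 3 and three for a = 5.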
Gauss's lemma allows us to conclude that for an odd prime $p$

\[ \left(\frac{-1}{p}\right) = \begin{cases} 1, & \text{if } p \equiv 1 \pmod{4}; \\ -1, & \text{if } p \equiv 3 \pmod{4}, \end{cases} \]

which was also the conclusion of Theorem 6.4. This follows immediately,
since the number of minus signs for the residues for

\[ -1, \; 2(-1), \; 3(-1), \; \ldots, \; \tfrac{p-1}{2} (-1) \]

from $-\frac{p-1}{2}$ to $\frac{p-1}{2}$ is even if $p$ is of the form $p = 4k + 1$ and odd if $p$ is of
the form $p = 4k + 3$, since the number of minus signs is $\frac{p-1}{2}$ in either
case.

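Similarly, we can use Gauss’s lemma to determine $\left(\frac{2}{p}\right)$ for each odd
prime $p$. For $a = 2$, Gauss’s lemma says to look at the numbers in the list

\[ 2, \; 2 \cdot 2, \; 3 \cdot 2, \; \ldots, \; \tfrac{p-1}{2} \cdot 2 ; \]

that is, look at the numbers

\[ 2, \; 4, \; 6, \; \ldots, \; p - 1 . \]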

For example, for p = 13, this list is 

2, 4, 6, 8, 10, 12, 

which reduces to 

2, 4, 6, -5, -3, -1; 

and so, by Gauss’s lemma, $\left(\frac{2}{13}\right) = (-1)^3 = -1$.

For $p = 17$, almost the exact same thing happens, since

2, 4, 6, 8, 10, 12, 14, 16

becomes

2, 4, 6, 8, -7, -5, -3, -1;

and $\left(\frac{2}{17}\right) = (-1)^4 = 1$.

What is clear is that, in each of these two cases,
the number of negative numbers and positive numbers is the same
because 13 and 17 are both of the form $4k + 1$, but in one case the
number of negative numbers was odd, and in the other case the number
of negative numbers was even. Therefore, we conclude that 2 will be a
quadratic nonresidue for any prime $p$ of the form $8k + 5$ and a quadratic
residue for any prime of the form $8k + 1$.

Next, let p = 11; in this case the list 

2, 4, 6, 8, 10 


becomes 


2, 4, -5, -3, -1, 

and, by Gauss’s lemma, $\left(\frac{2}{11}\right) = (-1)^3 = -1$.

For p = 7, almost the exact same thing happens, since 

2, 4, 6 


becomes


2, -3, -1,


and $\left(\frac{2}{7}\right) = (-1)^2 = 1$.

So for primes of the form $4k + 3$, there is one more negative number
than positive number in the list, and we conclude that 2 will be a
quadratic nonresidue for any prime $p$ of the form $8k + 3$ and a quadratic
residue for any prime of the form $8k + 7$.

Thus, using Gauss’s lemma, we have proven the following theorem, a 
result known to Fermat, and one that was also proven— but with much 
greater difficulty — by Euler and Lagrange. 

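Theorem 8.3.

\[ \left(\frac{2}{p}\right) = \begin{cases} 1, & \text{if } p \equiv 1 \text{ or } 7 \pmod{8}; \\ -1, & \text{if } p \equiv 3 \text{ or } 5 \pmod{8}. \end{cases} \]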

We can also use Gauss’s lemma to determine $\left(\frac{3}{p}\right)$ for each odd
prime $p$. We begin with the prime $p = 13$, and look at

3, 6, 9, 12, 15, 18, 

which reduce to 

3, 6, -4, -1, 2, 5. 

Hence, by Gauss’s lemma, $\left(\frac{3}{13}\right) = (-1)^2 = 1$.

In this case, where $p$ is of the form $12k + 1$, it is possible to see exactly
what is going on. The first two numbers, 3 and 6, fall between 0 and $\frac{p}{2}$,
and are positive; the next two numbers, 9 and 12, fall between $\frac{p}{2}$ and $p$,
and therefore are going to end up being negative; the last two numbers,
15 and 18, fall between $p$ and $\frac{3p}{2}$ and are again going to be positive.

We leave the details of verifying the following theorem to Problem 8.15.
For example, you can check that for $p = 19$, since 19 is 6
more than 13, which was the example we just looked at, there would
be three numbers on the list $3, 6, 9, \ldots, 27$ between 0 and $\frac{p}{2}$, three
numbers between $\frac{p}{2}$ and $p$, and three numbers between $p$ and $\frac{3p}{2}$, and
so, $\left(\frac{3}{19}\right) = (-1)^3 = -1$.


Theorem 8.4.

\[ \left(\frac{3}{p}\right) = \begin{cases} 1, & \text{if } p \equiv 1 \text{ or } 11 \pmod{12}; \\ -1, & \text{if } p \equiv 5 \text{ or } 7 \pmod{12}. \end{cases} \]
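
Theorems 8.3 and 8.4 can be tested against direct computation for many primes at once. The sketch below evaluates the Legendre symbol by Euler's criterion and compares it with the predicted value for every prime from 5 up to an arbitrary bound of 2000 (the prime 3 is left out because it divides a = 3). All function names are chosen here for illustration.

```python
def is_prime(n):
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def legendre(a, p):
    """Legendre symbol via Euler's criterion, for an odd prime p not dividing a."""
    value = pow(a, (p - 1) // 2, p)
    return -1 if value == p - 1 else value


def predicted_for_2(p):
    return 1 if p % 8 in (1, 7) else -1


def predicted_for_3(p):
    return 1 if p % 12 in (1, 11) else -1


primes = [p for p in range(5, 2000) if is_prime(p)]
print(all(legendre(2, p) == predicted_for_2(p) for p in primes))
print(all(legendre(3, p) == predicted_for_3(p) for p in primes))
```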


Euler's Conjecture 

Euler, without benefit of proof, but relying on massive calculations, 
conjectured that the beautiful patterns we have seen for the quadratic 
nature of a = 2 and a = 3 in Theorems 8.3 and 8.4 also hold 
more generally for other values of a. The most important feature of his 
conjecture is that whether a number $a$ is a quadratic residue of $p$ does
not really depend on the prime $p$ itself; rather, it depends only on the
remainder of $p$ modulo $4a$.

We saw in Theorem 8.3 that the quadratic nature of $a = 2$ depends
not on $p$ itself, but only on the remainder of $p$ modulo 8, and we saw in
Theorem 8.4 that the quadratic nature of $a = 3$ again depends not on
$p$, but only on the remainder of $p$ modulo 12. So too, then, by Euler's
conjecture, the quadratic nature of $a = 5$ for a prime $p$ will depend not
on $p$ itself, but only on the remainder of $p$ modulo 20.

Therefore, since $9^2 \equiv 5 \pmod{19}$, we know that 5 is a quadratic
residue of 19, and so, by this conjecture of Euler’s, we would immediately
see that 5 is also a quadratic residue for all primes $p$ congruent to
19 modulo 20, that is, for $p = 59, 79, 139, 179, 199, 239, \ldots$.

We saw earlier that 5 is a quadratic nonresidue of 13. Therefore, by
this same conjecture of Euler’s, 5 should also be a quadratic nonresidue
for the primes $p = 53, 73, 113, 173, 193, 233, \ldots$.

The other important feature of Euler's conjecture involves the symmetry
that is so evident in Theorems 8.3 and 8.4. For a number $a$, the
quadratic nature of $a$ for a prime $p$ with remainder $r$ modulo $4a$ is the
same as it is for a prime $q$ with remainder $4a - r$. For example, for $a = 2$,
if $p$ has remainder 1 modulo 8, and $q$ has remainder $7 = 8 - 1$, then 2
is a quadratic residue of both $p$ and $q$; on the other hand, for $a = 3$, if $p$
has remainder 5 modulo 12, and $q$ has remainder $7 = 12 - 5$, then 3 is a
quadratic nonresidue of both $p$ and $q$.

Similarly, continuing from the point above where we concluded
from Euler’s conjecture that 5 is a quadratic residue for all primes with
remainder 19 modulo 20, we could now also conclude, by the symmetry
feature of Euler’s conjecture, that 5 is a quadratic residue for primes with
remainder $20 - 19 = 1$, that is, for $41, 61, 101, 181, 241, 281, \ldots$.

Since Euler’s conjecture tells us that 5 is a quadratic nonresidue for
primes with remainder 13 modulo 20, then the symmetry feature of
Euler’s conjecture tells us that 5 is also a quadratic nonresidue for primes
with remainder $20 - 13 = 7$, that is, for $7, 47, 67, 107, 127, 167, \ldots$.

Here is a precise statement of Euler’s conjecture. 

Theorem 8.5 (Euler's conjecture). Let $p$ be an odd prime, and let $a$ be
relatively prime to $p$. Also, let $r$ be the remainder of $p$ modulo $4a$. Then
whether $a$ is a quadratic residue of $p$ depends only on $r$.

Moreover, if $q$ is a prime whose remainder is $4a - r$ modulo $4a$, then $a$ is a
quadratic residue of $p$ if and only if $a$ is a quadratic residue of $q$.
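
Euler arrived at this statement through massive calculation, and a modest version of that experiment is easy to repeat. The sketch below groups odd primes by their remainder modulo 4a and checks both parts of the theorem for positive integers a. The bound of 5000 on the primes and the range of a are arbitrary choices made here, and the function names are illustrative. Passing such a test supports the statement but does not prove it.

```python
def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def legendre(a, p):
    """Legendre symbol via Euler's criterion, for an odd prime p not dividing a."""
    value = pow(a, (p - 1) // 2, p)
    return -1 if value == p - 1 else value


def euler_conjecture_holds(a, prime_list):
    """Check both parts of the statement for a fixed positive integer a."""
    by_remainder = {}
    for p in prime_list:
        if p == 2 or a % p == 0:
            continue  # the statement concerns odd primes not dividing a
        by_remainder.setdefault(p % (4 * a), set()).add(legendre(a, p))
    # Part 1: every prime with the same remainder r gives the same symbol.
    if any(len(values) != 1 for values in by_remainder.values()):
        return False
    # Part 2: remainders r and 4a - r give the same symbol (when both occur).
    for r, values in by_remainder.items():
        mirror = by_remainder.get(4 * a - r)
        if mirror is not None and mirror != values:
            return False
    return True


primes = [p for p in range(3, 5000) if is_prime(p)]
print(all(euler_conjecture_holds(a, primes) for a in range(1, 31)))
```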

We will not give a detailed proof of Euler’s conjecture. The proof
is fairly routine, albeit somewhat messy, since by Gauss’s lemma it
is a matter of determining how many of the numbers in the list
$a, 2a, 3a, \ldots, \frac{p-1}{2} a$ have a negative remainder in the interval from
$-\frac{p-1}{2}$ to $\frac{p-1}{2}$, which amounts to counting how many of these numbers
fall in the intervals from $\frac{p}{2}$ to $p$, or $\frac{3p}{2}$ to $2p$, or $\frac{5p}{2}$ to $3p$, and so on. We
provide the necessary steps to prove Euler’s conjecture in Problems 8.16
and 8.17.


The Law of Quadratic Reciprocity 

Euler’s conjecture is, in fact, equivalent to the law of quadratic reciprocity, 
which was first stated by Legendre in 1785, and which ties the quadratic 
nature of two primes p and q together into a single elegant formula, 
making it one of the most beautiful and important theorems in number 
theory. 

Theorem 8.6 (the law of quadratic reciprocity). If $p$ and $q$ are distinct
odd primes, then

\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} . \]

Before we prove this theorem, let's spend some time getting a sense
of the power behind this remarkable formula. With considerable effort 
Euler could deduce, as we just saw, that 5 is a quadratic residue of the 
prime 239. In Problem 8.16, we see that, without too much trouble, 
Gauss's lemma can be used to reach the same conclusion. The law 
of quadratic reciprocity, however, turns this same question into pure 
calculation. 

Of course, calculations become routine only once you get the hang 
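of them. Here is how this one goes. First, note that $\left(\frac{239}{5}\right) = \left(\frac{4}{5}\right) = 1$.
This is because $239 \equiv 4 \pmod{5}$, and then because $4 = 2^2$ is a quadratic
residue of 5 (or of any other prime, for that matter). So, by the law of
quadratic reciprocity, since $\left(\frac{239}{5}\right) = 1$, the calculation becomes

\[ \left(\frac{5}{239}\right) = \left(\frac{5}{239}\right)\left(\frac{239}{5}\right) = (-1)^{\frac{5-1}{2} \cdot \frac{239-1}{2}} = (-1)^{2 \cdot 119} = 1 , \]
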


and so 5 is a quadratic residue of 239. 
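
For readers who like to confirm such a calculation by machine, here is a minimal Python sketch. It assumes a helper `legendre` that evaluates the Legendre symbol by Euler's criterion (Theorem 8.1) and a brute-force helper `square_roots`; both names are ours and are used only for illustration.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)      # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r

def square_roots(a, p):
    """All x in 1..p-1 with x^2 congruent to a modulo p (brute force)."""
    return [x for x in range(1, p) if (x * x - a) % p == 0]

print(legendre(239, 5))      # the easy "reciprocal" question
print(legendre(5, 239))      # the original question
print(square_roots(5, 239))  # the actual square roots of 5 modulo 239
```

The brute-force search is there only as a cross-check. The point of the law of quadratic reciprocity is that the answer can be found without it.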

This is a good time to think about why Theorem 8.6 is called a 
reciprocity law. In this example, we were asking whether 5 is a quadratic 
residue modulo 239. How did we answer that question? We did it by 
answering the much easier question: is 239 a quadratic residue modulo 
5? (The answer was yes, because $239 \equiv 4 \pmod 5$, and 4 is obviously a
quadratic residue.) In other words, we turned the question on its head. 
That’s why this is called a reciprocity law. Theorem 8.6 allows us to 
discover the quadratic nature of p modulo q by looking at the quadratic 
nature of q modulo p, which may be easy. 

The Legendre symbol even invokes this reciprocity in a very visual
way because $\left(\frac{p}{q}\right)$, which after all represents the quadratic nature of p
modulo q, and $\left(\frac{q}{p}\right)$, which represents the quadratic nature of q modulo
p, are indeed virtual reciprocals of one another, since, by Theorem 8.6,

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = \pm 1.$$

Let’s do another example. As we saw earlier, Euler knew that 5 is a 
quadratic nonresidue of 127, by the symmetry part of his conjecture. 
Using Theorem 8.6, we can show this far more easily with a single 
computation. First, we answer the “reciprocal” question, by noting that 
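$\left(\frac{127}{5}\right) = \left(\frac{2}{5}\right) = -1$ (by Theorem 8.3), and so,

$$\left(\frac{5}{127}\right) = \left(\frac{127}{5}\right)(-1)^{\frac{5-1}{2}\cdot\frac{127-1}{2}} = -(-1)^{2\cdot 63} = -1.$$

In both of these examples, it turned out that the exponent $\frac{p-1}{2}\cdot\frac{q-1}{2}$ is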
even. Therefore, 5 is a quadratic residue of 239 because 239 is a quadratic 
residue of 5; similarly, 5 is a quadratic nonresidue of 127 because 127 is 
a quadratic nonresidue of 5. In fact, an important way to think of the 
law of quadratic reciprocity is that $\left(\frac{p}{q}\right)$ and $\left(\frac{q}{p}\right)$ differ if and only if p
and q both are of the form 4n + 3. (This is the only way the exponent
$\frac{p-1}{2}\cdot\frac{q-1}{2}$ can be odd.) This leads to the visual representation of the law
of quadratic reciprocity presented in Figure 8.2. 
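
This way of reading the law can be checked directly for small primes. The Python sketch below is illustrative only: the helper names are ours, the Legendre symbol is computed with Euler's criterion, and the bound of 200 is arbitrary.

```python
from math import isqrt

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def odd_primes_below(n):
    """Odd primes less than n, by trial division."""
    return [m for m in range(3, n, 2)
            if all(m % d for d in range(3, isqrt(m) + 1, 2))]

def reciprocity_holds(p, q):
    """Check Theorem 8.6 for one pair of distinct odd primes."""
    exponent = ((p - 1) // 2) * ((q - 1) // 2)
    return legendre(p, q) * legendre(q, p) == (-1) ** exponent

def symbols_differ(p, q):
    """True when (p/q) and (q/p) have opposite signs."""
    return legendre(p, q) != legendre(q, p)

primes = odd_primes_below(200)
pairs = [(p, q) for p in primes for q in primes if p != q]
print(all(reciprocity_holds(p, q) for p, q in pairs))
print(all(symbols_differ(p, q) == (p % 4 == 3 and q % 4 == 3) for p, q in pairs))
```

Both lines should print True if the law is being applied correctly, since the two symbols disagree exactly when both primes have the form 4n + 3.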

In this figure, a Yes or No means that p is either a quadratic residue or 
a quadratic nonresidue, respectively, of q; whereas an upside down Yes 
or No means that, in the other direction, q is either a quadratic residue 
or a quadratic nonresidue, respectively, of p. The shaded squares are the 
squares where both p and q are of the form 4n + 3 and, hence, contain
one Yes and one No. The unshaded squares contain the same word Yes 
or No in each direction. 

Figure 8.2 The law of quadratic reciprocity.

Now, let's do an example to show that we are also able to deal with the
quadratic nature of composite numbers. We ask the question: is 1984
a quadratic residue of the composite number 2001? This amounts to
asking whether the quadratic congruence 

$$x^2 \equiv 1984 \pmod{2001}$$

has a solution. Since $2001 = 3 \cdot 23 \cdot 29$, this congruence has a solution
if and only if all three of the congruences

$$x^2 \equiv 1984 \pmod 3, \quad x^2 \equiv 1984 \pmod{23}, \quad x^2 \equiv 1984 \pmod{29},$$
have solutions. 

Next we deal with each of these congruences separately. Since the
modulus for each of these congruences is prime, we can use the
multiplicative property of the Legendre symbol: $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$.

Thus we evaluate $\left(\frac{1984}{p}\right)$ for each of these three primes, which we do
by writing $\left(\frac{1984}{p}\right) = \left(\frac{64}{p}\right)\left(\frac{31}{p}\right) = \left(\frac{31}{p}\right)$ (since $64 = 8^2$ is a quadratic
residue) and then evaluating $\left(\frac{31}{p}\right)$ for each prime p.

In Problem 8.18, you are asked to show that $\left(\frac{31}{3}\right) = 1$ and $\left(\frac{31}{23}\right) = 1$,
but $\left(\frac{31}{29}\right) = -1$. Therefore, the original congruence does not have a
solution, and 1984 is not a quadratic residue of 2001. Imagine having
to compute $1^2, 2^2, 3^2, \ldots, 1000^2$ modulo 2001 just to find that out!

Note, however, that had it been the case that each of these three con- 
gruences had a solution, then each would have had two solutions; so we 
could have then used the Chinese remainder theorem, Theorem 6.2, to 
solve the resulting eight systems of simultaneous congruences; hence 
the original congruence would have had a total of eight solutions. 
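
A computer, of course, does not mind doing the squaring. The following Python sketch (helper names are ours; the Legendre symbol is evaluated by Euler's criterion) compares the three Legendre symbols with a brute-force search modulo 2001. To illustrate the remark about eight solutions, it also counts the solutions of $x^2 \equiv 4 \pmod{2001}$, a congruence we chose ourselves because 4 is a quadratic residue of all three primes.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def square_roots(a, n):
    """All x in 0..n-1 with x^2 congruent to a modulo n (brute force)."""
    return [x for x in range(n) if (x * x - a) % n == 0]

symbols = {p: legendre(1984, p) for p in (3, 23, 29)}
print(symbols)                       # one symbol per prime factor of 2001
print(square_roots(1984, 2001))      # solutions of x^2 = 1984 (mod 2001)
print(len(square_roots(4, 2001)))    # number of solutions of x^2 = 4 (mod 2001)
```

The brute-force search is feasible here only because 2001 is small; the Legendre symbol calculation scales to moduli where a search would be hopeless, provided the factorization is known.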

We are now ready to prove the law of quadratic reciprocity. There 
are many proofs known for this theorem. Two are presented here to 
illustrate the vastly different perspectives from which such a funda- 
mentally important theorem can be viewed. The first proof is largely 
computational in nature, but it does rely heavily on Euler’s conjecture— 
which we did not prove. This proof was given by Scholz in 1939. 

First Proof of Theorem 8.6 

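We will consider two cases. If p and q are congruent modulo 4, then we
can assume, without any loss of generality, that p > q, and we write
p - q = 4a; thus

$$\left(\frac{p}{q}\right) = \left(\frac{q + 4a}{q}\right) = \left(\frac{4a}{q}\right) = \left(\frac{a}{q}\right)$$

and

$$\left(\frac{q}{p}\right) = \left(\frac{p - 4a}{p}\right) = \left(\frac{-4a}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{a}{p}\right).$$

In this case, since $p \equiv q \pmod{4a}$, we have, by Theorem 8.5,
$\left(\frac{a}{p}\right) = \left(\frac{a}{q}\right)$; and so

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = \left(\frac{-1}{p}\right) = \pm 1,$$

and this is +1 or -1, respectively, according to whether p is of the form
4n + 1 or 4n + 3, by Theorem 6.4.

If p is of the form 4n + 1, then $\frac{p-1}{2}$ is even, and so
$\left(\frac{-1}{p}\right) = 1 = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}$. On the other hand, if p is of the form 4n + 3, then q is also
of the form 4n + 3, and both $\frac{p-1}{2}$ and $\frac{q-1}{2}$ are odd, and so
$\left(\frac{-1}{p}\right) = -1 = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}$.

Therefore, in this case,

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}.$$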


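Now, in the case where p and q are not congruent modulo 4, we write
p + q = 4a; thus

$$\left(\frac{p}{q}\right) = \left(\frac{4a - q}{q}\right) = \left(\frac{4a}{q}\right) = \left(\frac{a}{q}\right) \quad\text{and}\quad \left(\frac{q}{p}\right) = \left(\frac{4a - p}{p}\right) = \left(\frac{4a}{p}\right) = \left(\frac{a}{p}\right).$$

Since, in this case, $p \equiv -q \pmod{4a}$, we have, again by Theorem 8.5,
$\left(\frac{a}{p}\right) = \left(\frac{a}{q}\right)$; and so $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$, and

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = 1.$$

In this case, one of p or q will be of the form 4n + 1, and so, one of
$\frac{p-1}{2}$ or $\frac{q-1}{2}$ is even. Therefore,

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = 1 = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}.$$

This completes the first proof.

Figure 8.3 Gotthold Eisenstein, 1823-52.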


The second proof of the law of quadratic reciprocity is due to 
Ferdinand Gotthold Max Eisenstein, who visited Gauss for a few weeks 
in 1844 in Göttingen, where Gauss had been the director of the observatory
since 1807, and four years after Eisenstein’s proof had been
published. Gauss, who was certainly not known for praising the work of 
others, claimed that there had been only three “epoch-making” math- 
ematicians: Archimedes, Newton, and Eisenstein. Gotthold Eisenstein 
died in 1852 of tuberculosis at the age of twenty-nine. 

Eisenstein’s proof has a very different character from the first proof. 
It is a combinatorial proof, which is to say that it involves a counting 
argument, and is very attractive, once you get the hang of the new 
approach. 

Second Proof of Theorem 8.6 

There are several key ideas in this proof. The first key idea is that we 
make note of the term $\frac{p-1}{2}\cdot\frac{q-1}{2}$ in the law of quadratic reciprocity, and
represent this number as the number of lattice points (that is to say,
points whose coordinates are both integers) inside a rectangle in the
first quadrant of the plane whose lower left corner is the origin (0, 0),
and whose upper right corner is the point $\left(\frac{p}{2}, \frac{q}{2}\right)$. You should draw a
picture at this stage. (Note that the corner point, $\left(\frac{p}{2}, \frac{q}{2}\right)$, is not a lattice
point because neither $\frac{p}{2}$ nor $\frac{q}{2}$ is an integer.)

The lattice points inside this rectangle form a rectangular array of 
points (this should be reminding you of a Greek-style diknume proof). 
The bottom row of lattice points are $(1, 1), (2, 1), \ldots, \left(\frac{p-1}{2}, 1\right)$, and the
left-hand column of lattice points are $(1, 1), (1, 2), \ldots, \left(1, \frac{q-1}{2}\right)$. This
means there are exactly

$$\frac{p-1}{2}\cdot\frac{q-1}{2}$$

lattice points inside this rectangle. That’s the first key idea. (Inciden- 
tally, that’s what we mean by a “counting argument”; as a first step, we 
have just counted the number of lattice points inside this rectangle.) 

Now we are going to count the number of lattice points inside this 
rectangle in a different way. That’s the way diknume proofs tend to 
work; look back at the proof of the formula for triangular numbers in 
Figure 2.2 — there, you count the number of points in a rectangle in two 
different ways (and end up with an important formula). Of course, this 
proof is harder, but the basic idea is still the same. 

At this point Eisenstein does a very strange thing, but in the end 
it will turn out to make the details of his proof far less complicated 
than they would otherwise be. He now switches to a rectangle which 
is twice as large in each direction, with the lower left corner still at the 
origin (0, 0), but the upper right corner at the point ( p , q ). So, trust 
Eisenstein, and draw a picture of this rectangle, including the lattice 
points inside this rectangle. Using graph paper, letting p = 19, q = 13,
is recommended (see Problem 8.21). 

You might even want to combine the two pictures into a single 
picture, since the original small rectangle forms the lower left quadrant 
of the large rectangle, and the point $\left(\frac{p}{2}, \frac{q}{2}\right)$ is at the exact center of the
large rectangle. 

Next, motivated completely by the symmetry in the expression $\frac{p-1}{2}\cdot\frac{q-1}{2}$,
Eisenstein divides the rectangle by a diagonal line running from
the origin (0, 0) to the opposite corner at the point (p, q) (and passing
through $\left(\frac{p}{2}, \frac{q}{2}\right)$). If you have done a very careful drawing, you will notice
that this diagonal line does not pass through any of the lattice points
inside the large rectangle. Suppose, by contradiction, that there is a
lattice point (a, b) inside the large rectangle on this diagonal. Then
there will be two similar right triangles in your picture, one with height
b and base a, and the other with height q and base p. Therefore, $\frac{b}{a} = \frac{q}{p}$,
that is, bp = aq, which is impossible, since p and q are distinct primes, and
a < p, b < q: the prime p would have to divide a.

That’s the second key idea in the proof. The diagonal line “splits” 
the rectangular array of lattice points in the large rectangle into two 
identical parts: the lattice points above the diagonal line and the lattice 
points below the diagonal line. This is because of symmetry. If you turn 
a picture of the large rectangle and the lattice points upside down, it is 
exactly the same. 

If you drew a 13 by 19 rectangle, this would be a good time to check 
that, in your drawing, you have below the diagonal, in the 18 columns, 
respectively, 0, 1, 2, 2, 3, 4, 4, 5, 6, 6, 7, 8, 8, 9, 10, 10, 11, 12 lattice
points. Therefore, by symmetry, you should have, in the 18 columns, 
respectively, 12, 11, 10, 10, 9, 8, 8, 7, 6, 6, 5, 4, 4, 3, 2, 2, 1, 0 lattice 
points above the diagonal. 
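
If you would like to check your drawing, the column counts can be generated by a short Python sketch. It assumes, as the proof goes on to explain, that column i contains exactly as many lattice points below the diagonal as the integer part of iq/p; the function names are ours.

```python
def below_diagonal(p, q):
    """Lattice points below the diagonal in columns 1..p-1 of the p by q rectangle."""
    return [(i * q) // p for i in range(1, p)]

def above_diagonal(p, q):
    """Each column holds q - 1 lattice points, none of them on the diagonal."""
    return [(q - 1) - n for n in below_diagonal(p, q)]

below = below_diagonal(19, 13)
above = above_diagonal(19, 13)
print(below)
print(above)
print(above == below[::-1])   # the symmetry: one list is the other reversed
```

The last line expresses the symmetry: turning the picture upside down exchanges the points above the diagonal with the points below it.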

Now we come to the third key idea, and the clever part of Eisenstein’s 
proof. Eisenstein counts the number of lattice points inside the small 
rectangle below the diagonal line by counting the number of lattice 
points in the large rectangle below the diagonal line in the even-numbered
columns!

The reason he can do this is subtle. His goal is to count the number 
of lattice points inside the small rectangle below the diagonal line. As 
he starts out counting even-numbered columns, he is at first inside 
the small rectangle. What happens when he gets to even-numbered 
columns to the right of the small rectangle? For an even-numbered 
column i, where $i > \frac{p}{2}$, the lattice points above the diagonal in column
i correspond exactly to the same number of lattice points below the
diagonal in the odd-numbered column p - i in the original small
rectangle. We observed this symmetry earlier. 

So, instead of counting the lattice points in the odd column p - i 
below the diagonal, he counts the lattice points in the even column 
i below the diagonal. Why does this work? Well, because he doesn’t 
need the exact number, he only needs the “parity” to be right. The total 
number of lattice points in any column in the rectangle is q - 1 , which 
is even; so, for any column i, the parity of the number of lattice points 
below the diagonal in this column will be the same as the parity of the 
number of lattice points above the diagonal in this column — that is, 
if there are an odd number of lattice points in the column below the 
diagonal, then there will be an odd number above the diagonal; or, if 
there are an even number below the diagonal, there will be an even 
number above the diagonal. 
In other words, by counting the number of lattice points below
the diagonal in the even-numbered columns in the large rectangle,
Eisenstein is counting (that is, he is correctly counting the parity of)
the number of lattice points below the diagonal in the original small
rectangle. This will be good enough, because the parity of this count is
all we are really interested in, since it is the expression $(-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}$ that
appears in the law we are trying to prove.

Before we continue with this proof, let’s see how this parity idea is 
working in the 19 by 13 rectangle we hope you have been drawing. 
When Eisenstein counts the number of lattice points below the diag- 
onal in the even-numbered columns in this rectangle, he gets 1 + 2 + 
4 + 5 + 6 + 8 + 9 + 10 + 12 = 57; whereas, if we were to count the number 
of lattice points below the diagonal in the small rectangle, we would get 
0 + 1 + 2 + 2 + 3 + 4 + 4 + 5 + 6 = 27. These two numbers, 57 and 
27, have the same parity — that is, they are both odd — and, therefore, 
$(-1)^{57} = (-1)^{27}$, which is all that matters.
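
The same comparison can be made for other pairs of primes. In the Python sketch below (function names are ours), `even_column_count` is Eisenstein's count over the even-numbered columns of the large rectangle, and `small_rectangle_count` is the count below the diagonal in the small rectangle; the list of primes tried is an arbitrary choice.

```python
def even_column_count(p, q):
    """Lattice points below the diagonal in columns 2, 4, ..., p-1 of the large rectangle."""
    return sum((i * q) // p for i in range(2, p, 2))

def small_rectangle_count(p, q):
    """Lattice points below the diagonal in columns 1, 2, ..., (p-1)/2."""
    return sum((i * q) // p for i in range(1, (p - 1) // 2 + 1))

print(even_column_count(19, 13), small_rectangle_count(19, 13))

# The two counts usually differ, but their parities should always agree.
small_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
print(all(even_column_count(p, q) % 2 == small_rectangle_count(p, q) % 2
          for p in small_primes for q in small_primes if p != q))
```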

Now, for the actual counting — the fourth key idea in the proof — we 
need a way to represent the number of lattice points inside the rectangle 
below the diagonal line. Since the equation of the diagonal line is 
$y = \frac{q}{p}x$, we see that any column of lattice points inside the rectangle
above an integer i on the x-axis will be above the point (i, 0), and those
lattice points in this column below the diagonal line will be below the
point $\left(i, \frac{q}{p}i\right)$ on the diagonal. Therefore, this column of lattice points
will consist of the points $(i, 1), (i, 2), (i, 3), \ldots, \left(i, \left\lfloor \frac{iq}{p} \right\rfloor\right)$. Hence, this
column will have exactly $\left\lfloor \frac{iq}{p} \right\rfloor$ lattice points.

So, if we add up the number of lattice points in each even-numbered 
column starting at i = 2, and going to the last even-numbered column 
on the right at i = p - 1, we see that there are 

$$\sum_{i=2,\ i\ \text{even}}^{p-1} \left\lfloor \frac{iq}{p} \right\rfloor$$

lattice points in even-numbered columns inside the large rectangle 
below the diagonal line. 

As you can probably sense, we are getting close to finishing the proof, 
but we still have to show what the number of lattice points below 
the diagonal in the even-numbered columns in this rectangle has to 
do with quadratic reciprocity. This is the fifth key idea in the proof, 
and this idea about the relationship between lattice points and the 
quadratic nature of a prime q modulo a prime p is itself important 
enough that it is known as Eisenstein’s lemma:

Theorem 8.7 (Eisenstein's lemma). If p and q are two distinct odd primes,
then

$$\left(\frac{q}{p}\right) = (-1)^{\sum_{i=2,\ i\ \text{even}}^{p-1} \left\lfloor \frac{iq}{p} \right\rfloor},$$

where the sum is taken over the even integers, from 2 to p - 1.

Proof of Eisenstein's lemma 

For each even integer i from 2 to p - 1, by the division algorithm, we
can write

$$iq = \left\lfloor \frac{iq}{p} \right\rfloor p + r_i,$$

where $r_i$ is the residue of iq modulo p, with $0 < r_i < p$. We note, because
it will soon become important, that since i is even, $r_i$ and $\left\lfloor \frac{iq}{p} \right\rfloor$ have the
same parity, that is, $r_i \equiv \left\lfloor \frac{iq}{p} \right\rfloor \pmod 2$.

Now, the $\frac{p-1}{2}$ numbers in the list

$$2q,\ 4q,\ \ldots,\ (p-1)q$$

are distinct modulo p, hence the $\frac{p-1}{2}$ numbers

$$r_2,\ r_4,\ \ldots,\ r_{p-1}$$

are distinct modulo p as well. But, more than that, just as we saw in the
proof of Gauss’s lemma, Theorem 8.2, the $\frac{p-1}{2}$ numbers

$$(-1)^{r_2} r_2,\ (-1)^{r_4} r_4,\ \ldots,\ (-1)^{r_{p-1}} r_{p-1}$$

are also distinct modulo p. Therefore, these numbers must be congruent,
in some order, to the $\frac{p-1}{2}$ even numbers

$$2,\ 4,\ \ldots,\ p-1.$$

This is because if $r_i$ is even, then $(-1)^{r_i} r_i = r_i$; whereas, if $r_i$ is odd, then
$(-1)^{r_i} r_i = -r_i$ is congruent modulo p to the even number $p - r_i$.

So we can write

$$\left((-1)^{r_2} r_2\right) \cdot \left((-1)^{r_4} r_4\right) \cdots \left((-1)^{r_{p-1}} r_{p-1}\right) \equiv 2 \cdot 4 \cdots (p-1) \pmod p,$$

but $r_i \equiv iq \pmod p$ for each even i, so we can rewrite this as

$$\left((-1)^{r_2}\, 2q\right) \cdot \left((-1)^{r_4}\, 4q\right) \cdots \left((-1)^{r_{p-1}} (p-1)q\right) \equiv 2 \cdot 4 \cdots (p-1) \pmod p.$$

Dividing by $2 \cdot 4 \cdots (p-1)$ produces

$$(-1)^{\sum_{i=2,\ i\ \text{even}}^{p-1} r_i}\; q^{\frac{p-1}{2}} \equiv 1 \pmod p,$$

which we rewrite as

$$q^{\frac{p-1}{2}} \equiv (-1)^{\sum_{i=2,\ i\ \text{even}}^{p-1} r_i} \pmod p,$$

so that we can apply Euler’s criterion, Theorem 8.1, to get

$$\left(\frac{q}{p}\right) = (-1)^{\sum_{i=2,\ i\ \text{even}}^{p-1} r_i}.$$

The final step is to recall that $r_i \equiv \left\lfloor \frac{iq}{p} \right\rfloor \pmod 2$ for each even i, and
so

$$\left(\frac{q}{p}\right) = (-1)^{\sum_{i=2,\ i\ \text{even}}^{p-1} r_i} = (-1)^{\sum_{i=2,\ i\ \text{even}}^{p-1} \left\lfloor \frac{iq}{p} \right\rfloor}.$$

This completes the proof of Eisenstein’s lemma. ■ 
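
Both the key step of this proof and the lemma itself can be tested on small primes. The Python sketch below is a numerical illustration under our own naming: `signed_residues` lists the numbers $(-1)^{r_i} r_i$ reduced modulo p, `legendre_by_eisenstein` evaluates the right-hand side of the lemma, and `legendre` uses Euler's criterion for comparison.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def signed_residues(q, p):
    """The numbers (-1)^r * r modulo p, for the residues r of 2q, 4q, ..., (p-1)q."""
    residues = [(i * q) % p for i in range(2, p, 2)]
    return sorted(((-1) ** r * r) % p for r in residues)

def legendre_by_eisenstein(q, p):
    """(q/p) computed from the sum of the integer parts of iq/p over even i."""
    exponent = sum((i * q) // p for i in range(2, p, 2))
    return (-1) ** exponent

print(signed_residues(13, 19))          # should be the even numbers 2, 4, ..., 18
print(legendre_by_eisenstein(13, 19), legendre(13, 19))
```

For p = 19 and q = 13 the exponent is the sum 57 found earlier, so the lemma gives $(-1)^{57} = -1$.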

We can now quickly finish the proof of the law of quadratic reci- 
procity. 

Up to this point, we have only counted the lattice points in the 
original small rectangle that are below the diagonal, and of course 
we also need to count the ones that are “above”— or, more accurately, 
we should say to the left of — the diagonal in this rectangle. But, by 
symmetry, we can do that in exactly the same way, counting lattice 
points in rows to the left of the diagonal line. 

We are now ready to put everything together. Eisenstein’s lemma 
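allows us to express the two Legendre symbols $\left(\frac{p}{q}\right)$ and $\left(\frac{q}{p}\right)$ in terms of a
count of the even rows and columns of the large rectangle. The third key
idea from the proof then tells us that this count of lattice points from
even rows and columns of the large rectangle is the same, modulo 2, as
the count of lattice points above and below the diagonal in the original
small rectangle. And the very first key idea tells us that the total number
of lattice points inside the original small rectangle is given by $\frac{p-1}{2}\cdot\frac{q-1}{2}$.
Hence we conclude that

$$\begin{aligned}
\left(\frac{p}{q}\right)\left(\frac{q}{p}\right)
&= (-1)^{\sum_{j=2,\ j\ \text{even}}^{q-1} \left\lfloor \frac{jp}{q} \right\rfloor} \cdot (-1)^{\sum_{i=2,\ i\ \text{even}}^{p-1} \left\lfloor \frac{iq}{p} \right\rfloor} \\
&= (-1)^{\sum_{j=1}^{(q-1)/2} \left\lfloor \frac{jp}{q} \right\rfloor} \cdot (-1)^{\sum_{i=1}^{(p-1)/2} \left\lfloor \frac{iq}{p} \right\rfloor} \\
&= (-1)^{\sum_{j=1}^{(q-1)/2} \left\lfloor \frac{jp}{q} \right\rfloor + \sum_{i=1}^{(p-1)/2} \left\lfloor \frac{iq}{p} \right\rfloor} \\
&= (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}.
\end{aligned}$$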

This completes the second proof of the law of quadratic reciprocity. ■ 
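
The counting at the heart of this proof can be replayed by machine. The following Python sketch (names are ours) counts the lattice points of the small rectangle directly, sorting them into those below and those to the left of the diagonal, and compares the result with the two sums of integer parts used in the last step of the proof.

```python
def count_small_rectangle(p, q):
    """Directly count lattice points of the small rectangle below / left of the diagonal."""
    below = left = 0
    for x in range(1, (p - 1) // 2 + 1):
        for y in range(1, (q - 1) // 2 + 1):
            if y * p < x * q:      # (x, y) lies below the line y = (q/p) x
                below += 1
            else:                  # no lattice point lies on the line itself
                left += 1
    return below, left

def floor_sums(p, q):
    """The same two counts, written as sums of integer parts."""
    below = sum((i * q) // p for i in range(1, (p - 1) // 2 + 1))
    left = sum((j * p) // q for j in range(1, (q - 1) // 2 + 1))
    return below, left

p, q = 19, 13
print(count_small_rectangle(p, q), floor_sums(p, q))
print(sum(floor_sums(p, q)) == ((p - 1) // 2) * ((q - 1) // 2))
```

The two counts together exhaust the small rectangle, which is why their sum equals $\frac{p-1}{2}\cdot\frac{q-1}{2}$.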


Problems 

8.1 * (S) Prove that the statement 

every positive integer can be written as a sum of three or fewer 
triangular numbers 

is equivalent to the statement 

every positive integer of the form 8 n + 3 is a sum of three odd 
squares. 

Illustrate this equivalence by writing 115 as a sum of three squares by 
first writing 14 as a sum of three triangular numbers. 

This equivalence verifies Gauss’s claim that 
ΕΥΡΗΚΑ! num = Δ + Δ + Δ. In Chapter 7, we mentioned that any
number not of the form $4^n(8k + 7)$ can be written as a sum of three
squares (see Problem 7.32). So every number of the form 8n + 3 is a sum
of three squares, and these three squares must all be odd, since any 
square is congruent to 0 or 1 modulo 4. 

8.2 (S) Gustav Peter Lejeune Dirichlet, who was Eisenstein’s doctoral thesis 

adviser at the University of Berlin, found a remarkable formula that 
says if 8 n + 3 is prime, then the number of ways of representing n as a 
sum of three triangular numbers is that number by which the number 
of quadratic residues of 8n + 3 in the interval from 1 to 4 n + 1 exceeds 
the number of quadratic nonresidues in the same interval. Verify that 
Dirichlet’s formula works for the number n = 5.

8.3 * (S) A good method for finding all the primes from 1 to n is to list all the 

numbers from 2 to n (in practice it is a good idea to do this in fairly
neat rows, say, by tens) and then cross out all multiples of primes,
leaving just the primes. You begin by crossing out all multiples of 2 
(except 2, of course); then cross out all multiples of 3 (except 3); then 
multiples of 5; and so on. 

At any stage, after you have crossed out all multiples of a given prime 
p, the smallest number greater than p remaining in the list that is not 
yet crossed out will be the next prime greater than p, so this process 
generates the list of primes as you go. 

You stop this process as soon as you reach the last prime $p \le \sqrt{n}$ and
cross off its multiples, that is, as soon as you know that the next prime
q on the list will have to be greater than $\sqrt{n}$. This works because if ab is a
composite number less than or equal to n, then one of the numbers a or
b must be less than or equal to $\sqrt{n}$ (they can’t both be greater than $\sqrt{n}$).

Use this method to find π(200), where π(x) represents the number
of primes less than or equal to x.

This method for finding primes is called the sieve of Eratosthenes. 
Eratosthenes was a contemporary of Archimedes, and the head 
librarian of the great library of Alexandria during the late third century 
B.C. He is best known for being the first person to accurately measure 
the circumference of the earth. At noon on the summer solstice in 
Alexandria Eratosthenes observed that the sun was one-fiftieth of a full 
circle south of straight overhead, while at the same time in a city 5000 
stadia south of Alexandria, it was straight overhead. So he could 
calculate the circumference of the earth to be 250 000 stadia. He also 
measured the distances to the sun and the moon, and the tilt of the 
earth’s axis. 
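
The crossing-out procedure described in this problem translates almost word for word into a program. Here is a minimal Python sketch; the function name `sieve_of_eratosthenes` and the choice n = 30 in the example are ours, so that the problem itself is still left for you to do by hand.

```python
from math import isqrt

def sieve_of_eratosthenes(n):
    """Return the list of primes from 2 to n by crossing out multiples."""
    crossed_out = [False] * (n + 1)
    primes = []
    for m in range(2, n + 1):
        if crossed_out[m]:
            continue
        primes.append(m)              # the smallest number not yet crossed out is prime
        if m <= isqrt(n):             # crossing out can stop after the last prime <= sqrt(n)
            for multiple in range(2 * m, n + 1, m):
                crossed_out[multiple] = True
    return primes

print(sieve_of_eratosthenes(30))
```

Once the sieve has passed $\sqrt{n}$, every number still standing is prime, so the loop simply collects them.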

8.4 ★ (S) The prime number theorem, first conjectured by Gauss as a teenager, 

says that π(x), the number of primes less than or equal to x, is
approximately $\frac{x}{\ln x}$, where ln x is the natural logarithm function.

For each of the following values of x, as well as for the value x = 200
from Problem 8.3, compare the true value of π(x) with the
approximation $\frac{x}{\ln x}$ given by the prime number theorem.

π(500) = 95;
π(1000) = 168;
π(5000) = 669;
π(10 000) = 1229;
π(100 000) = 9592;
π(1 000 000) = 78 498.

8.5 (S) What we mean when we say that $\frac{x}{\ln x}$ is a “good” approximation for

π(x) is that for large values of x the ratio of these two quantities is close
to 1. In other words, a precise statement of the prime number theorem is
that

$$\lim_{x \to \infty} \frac{\pi(x)}{x / \ln x} = 1.$$

In this situation, we say that the two functions π(x) and $\frac{x}{\ln x}$ are
asymptotically equivalent.

For each value of x in Problem 8.4, including x = 200, compare the
ratio of π(x) and $\frac{x}{\ln x}$ with 1.

Gauss found another function that is asymptotically equivalent to
π(x); in fact, he believed it gives an even better approximation for π(x)
than $\frac{x}{\ln x}$ does. This function is called the logarithmic integral, and is
denoted by Li(x),

$$\mathrm{Li}(x) = \int_2^x \frac{dt}{\ln t}.$$

For each value of x in Problem 8.4, including x = 200, compare the
true value of π(x) with the approximation Li(x); and then, for each of
these values of x, also compare the ratio of π(x) and Li(x) with 1.

Do you agree with Gauss that Li(x) is a better approximation for π(x)
than $\frac{x}{\ln x}$?


8.6 (H,S) Prove that the logarithmic integral Li(x) and $\frac{x}{\ln x}$ are asymptotically
equivalent; that is, prove that

$$\lim_{x \to \infty} \frac{\mathrm{Li}(x)}{x / \ln x} = 1.$$

8.7 * (S) Fermat knew which primes are sums of two squares — that is, 

Theorem 5.1 — by 1640, but it wasn’t until 1654 in a letter to Pascal that 
he stated his results for primes of the forms $p = x^2 + 2y^2$ and
$p = x^2 + 3y^2$. In 1658 he wrote another letter explaining all three of
these results. Here is how he phrased Theorem 5.1 in that letter: 

Every prime number which surpasses by one a multiple of four is 
composed of two squares. Examples are 5, 13, 17, 29, 37, 41, etc. 

He also stated his theorem for which primes are of the form
$p = x^2 + 3y^2$. Fill in the blanks in Fermat’s statement of this result by
deciding experimentally which odd primes up to 50 can be written in
this form.

Every prime number which surpasses by _ a multiple of _ is

composed of a square and the triple of another square. Examples
are _ , _ , _ , _ , _ , _ , etc.

Now, do the same for Fermat’s statement about the odd primes of the
form $p = x^2 + 2y^2$.

Every prime number which surpasses by _ or _ a multiple of _

is composed of a square and the double of another square.

Examples are _ , _ , _ , _ , _ , _ , etc.

8.8 (H) In 1744, Euler made a conjecture about which odd primes p are of the 

form $p = x^2 + 5y^2$. Reproduce Euler's conjecture by deciding
experimentally which primes up to 200 are of this form. 

Consult Problem 8.3 for a quick method to produce the primes less 
than 200. 

8.9 (H,S) Euler’s awareness of quadratic reciprocity grew out of his interest in 

the question of which odd primes have the form $p = x^2 + ny^2$ for a
given positive integer n. Prove that if p is an odd prime, and n ≠ 0 is an
integer relatively prime to p, then there are relatively prime integers x
and y such that

$$p \mid x^2 + ny^2$$

if and only if

$$\left(\frac{-n}{p}\right) = 1.$$

8.10 ★ (Do this problem without a calculator.) In Chapter 7, we first saw that 6 

is a quadratic residue of the prime 19, because $5^2 = 25 \equiv 6 \pmod{19}$.
Verify that 6 is a quadratic residue of 19 using Euler's criterion,
Theorem 8.1.

We also saw in Chapter 7 that 3 is a quadratic nonresidue of 19,
because it is not congruent to any of $1^2, 2^2, 3^2, \ldots, 9^2$ modulo 19.
Verify that 3 is a quadratic nonresidue using Euler’s criterion. 

8.11 (H,S) Use Euler’s criterion, Theorem 8.1, to prove Theorem 6.4. 

8.12 (H,S) The Legendre symbol is an excellent example of the usefulness of 

good mathematical notation. On the other hand, the lack of good 
notation can have the opposite effect: Gauss was utterly amazed that 
Archimedes never developed the decimal system for numbers and 
lamented, “To what heights would science now be raised if Archimedes 
had made that discovery!”

Use the Legendre symbol to prove the following.

(a) For primes p of the form p = 4n + 1, a number a is a quadratic
residue of p if and only if -a, that is, p - a, is a quadratic residue. 

(b) For primes p of the form p = 4n + 3, a number a is a quadratic 
residue of p if and only if -a, that is, p - a, is a quadratic 
nonresidue. 

8.13 ★ (S) Repeat Problem 8.10, but this time use Gauss’s lemma, 

Theorem 8.2, to show that 6 is a quadratic residue of 19, and 3 is a 
quadratic nonresidue. 

8.14 Verify Theorem 8.3 for the four primes 29, 31, 41, and 43, by finding 
values of a for two of these primes (in fact, two values of a for both of
these primes p) such that $a^2 \equiv 2 \pmod p$; and, by showing that, for
the other two primes, 2 is a quadratic nonresidue, by finding all the 
quadratic residues for these primes, and observing that 2 is not among 
them. 

8.15 (H,S) Prove Theorem 8.4. 

Problems 8.16 - 8.17 investigate how to prove Euler’s conjecture, 
Theorem 8.5. 


8.16 (S) The first part of Euler’s conjecture, Theorem 8.5, says that whether a is 
a quadratic residue of p depends only on the remainder r of p modulo 
4a. 

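Illustrate the basic idea behind how a proof of this part of Euler’s
conjecture would work by showing, for the number a = 5, and for each
of the primes 19, 59, and 79, into which of the five intervals each number in
the list

$$5,\ 10,\ 15,\ \ldots,\ \frac{p-1}{2} \cdot 5$$

falls:

$$\left[0, \tfrac{p}{2}\right],\quad \left[\tfrac{p}{2}, p\right],\quad \left[p, \tfrac{3p}{2}\right],\quad \left[\tfrac{3p}{2}, 2p\right],\quad \left[2p, \tfrac{5p}{2}\right].$$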

Clearly describe any pattern that emerges; then explain why 5 must be 
a quadratic residue for any prime whose remainder is 19 modulo 20. 


Problem 8.16 illustrates the basic reason why the first part of Euler’s 
conjecture on quadratic reciprocity is true. But formulating an actual 
proof of that theorem needs more detail. We can describe how to 
provide that detail by looking at the examples in Problem 8.16 again. 
We saw, for p = 19, that 10 and 15 fall in the interval $\left[\frac{19}{2}, 19\right]$ and,
by Gauss’s lemma, that creates two negative signs. Let’s see how to
explain what effect that has on p = 59. The corresponding interval for
p = 59 is $\left[\frac{59}{2}, 59\right]$, and we saw that there are six multiples of 5 that fall
in this interval, and hence that produces six negative signs. But why
are there again an even number? We need a new idea to explain this in a
way that could be used in a rigorous proof.

Think of starting with the original interval $\left[\frac{19}{2}, 19\right]$. We know we
have an even number of multiples of 5 in this interval. We add 40 to
the right-hand endpoint of this interval to create a new interval,
$\left[\frac{19}{2}, 59\right]$. Since we have increased the length of the original interval by
40, this new interval will also still contain an even number of multiples
of 5. Next, add 20 to the left-hand endpoint of this interval to create
the interval $\left[\frac{59}{2}, 59\right]$. Since this time we decreased the length of the
interval by 20, this interval will also still contain an even number of
multiples of 5. This idea explains why the interval $\left[\frac{59}{2}, 59\right]$ contains an
even number of multiples of 5. What about the prime p = 79? If we
add 20 to 59, and add 10 to $\frac{59}{2}$, we get the endpoints for the interval
$\left[\frac{79}{2}, 79\right]$, and this interval will also contain an even number of
multiples of 5. This is an inductive argument that works in general.

It is worth stating this general idea clearly. Let [c, d] be any interval, 
possibly including negative values of c or d; then, for any integer value 
of k, if the interval [c, d] contains an even number of multiples of 5, 
then so does the interval [c, d + 10k]; and if the interval [c, d] contains 
an odd number of multiples of 5, then so does the interval [c, d + 10k]. 
Similarly for the interval [c + 10k, d]. (Note: even though the notation
is inadequate, we intend this to include the possibility that c + 10k > d, 
and so this interval might actually become [d, c + 10k]; make sure you 
still believe the statement in this case.) 
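
The interval idea is easy to experiment with. The Python sketch below counts the multiples of 5 in a closed interval and checks that the parity of the count survives a shift of either endpoint by 10k. The function name is ours, exact fractions are used so that endpoints such as 19/2 cause no rounding trouble, and the check assumes, as in the examples above, that neither endpoint is itself a multiple of 5.

```python
from fractions import Fraction
from math import ceil, floor

def multiples_in_interval(c, d, m=5):
    """Number of multiples of m in the closed interval with endpoints c and d."""
    lo, hi = sorted((Fraction(c), Fraction(d)))   # allows c > d, as in the Note
    return floor(hi / m) - ceil(lo / m) + 1

# The intervals discussed above for p = 19, 59, and 79.
for c, d in [(Fraction(19, 2), 19), (Fraction(19, 2), 59),
             (Fraction(59, 2), 59), (Fraction(79, 2), 79)]:
    print(c, d, multiples_in_interval(c, d))

def parity_is_preserved(c, d, k):
    """Compare [c, d] with [c, d + 10k] and [c + 10k, d]."""
    base = multiples_in_interval(c, d) % 2
    return (multiples_in_interval(c, d + 10 * k) % 2 == base
            and multiples_in_interval(c + 10 * k, d) % 2 == base)

print(all(parity_is_preserved(Fraction(13, 2), 13, k) for k in range(-5, 6)))
```

Negative values of k are included on purpose, so that the reversed intervals mentioned in the Note are exercised as well.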

8.17 (H,S) The second part of Euler’s conjecture, Theorem 8.5, says that if r is 
the remainder of p modulo 4a, and if q is a prime whose remainder is
4a - r modulo 4a, then a is a quadratic residue of p if and only if a is a 
quadratic residue of q. 

Use the idea about intervals described above to illustrate the basic 
idea behind how a proof of this part of Euler’s conjecture would work 
by taking the quadratic nonresidue a = 5 of the prime 13, and arguing 
that since the intervals $\left[\frac{13}{2}, 13\right]$ and $\left[\frac{39}{2}, 26\right]$ contain an odd
number of multiples of 5, then the intervals $\left[\frac{7}{2}, 7\right]$ and $\left[\frac{21}{2}, 14\right]$ also
contain an odd number of multiples of 5, and, therefore, that a = 5 is a 
quadratic nonresidue of the prime 7. 

In other words, you are to illustrate an inductive step by showing that 
since there are an odd number of multiples of 5 in one pair of intervals, 
then there are also an odd number of multiples of 5 in the other pair of
intervals. 

8.18 * (H,S) We showed how the law of quadratic reciprocity, Theorem 8.6, can 

be applied to questions involving quadratic residues having composite 
moduli, for example, to the question: is 1984 a quadratic residue of the 
composite number 2001? 

The idea, based on the Chinese remainder theorem, is that since 
$2001 = 3 \cdot 23 \cdot 29$, the number 1984 needs to be a quadratic residue of
all three primes, 3, 23, and 29, if it is to be a quadratic residue of 2001.

Show that 1984 is not a quadratic residue of 2001 (that is, it is a
quadratic nonresidue) by evaluating the three Legendre symbols
$\left(\frac{1984}{3}\right)$, $\left(\frac{1984}{23}\right)$, and $\left(\frac{1984}{29}\right)$.

8.19 (S) You may have wondered why so much of our attention — not to 

mention Gauss’s and Euler’s — has been on quadratic residues and the 
quadratic congruence $x^2 \equiv a \pmod p$. We have been completely
ignoring the general quadratic congruence $ax^2 + bx + c \equiv 0 \pmod p$.
Here’s why.

A general quadratic congruence

$$ax^2 + bx + c \equiv 0 \pmod p,$$

where a and p are relatively prime, and p an odd prime, can be reduced
to a simple quadratic congruence of the form

$$y^2 \equiv d \pmod p$$

by setting $y = 2ax + b$ and $d = b^2 - 4ac$.

Verify that this reduction works; and, in particular, verify that the
linear congruence $2ax + b \equiv y \pmod p$ has a unique solution.

Then, use this reduction to solve the congruence 

3x 2 — lx + 3 = 0 (mod 23). 
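
The reduction described in this problem can be turned into a small program. The Python sketch below is only one possible arrangement: the function names are ours, the square roots of d are found by brute force, and the example congruence $x^2 + x + 1 \equiv 0 \pmod 7$ is our own choice rather than the one posed above. It relies on `pow(2 * a, -1, p)` for the inverse of 2a modulo p, which requires Python 3.8 or later.

```python
def solve_quadratic(a, b, c, p):
    """Solve a x^2 + b x + c = 0 (mod p), for an odd prime p not dividing a."""
    d = (b * b - 4 * a * c) % p
    ys = [y for y in range(p) if (y * y - d) % p == 0]   # square roots of d, brute force
    inv_2a = pow(2 * a, -1, p)                           # 2a is invertible modulo p
    # For each y, the linear congruence 2ax + b = y (mod p) has exactly one solution x.
    return sorted({(y - b) * inv_2a % p for y in ys})

def solve_by_search(a, b, c, p):
    """The same solutions, found by trying every x (for comparison only)."""
    return [x for x in range(p) if (a * x * x + b * x + c) % p == 0]

print(solve_quadratic(1, 1, 1, 7), solve_by_search(1, 1, 1, 7))
```

The reduction works because $4a(ax^2 + bx + c) = (2ax + b)^2 - (b^2 - 4ac)$, and 4a is invertible modulo the odd prime p.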


8.20 (S) Evaluate the following Legendre symbols: 


(a) 





8.21 * (S) Illustrate several of the ideas used in the second proof — Eisenstein’s 
proof — of the law of quadratic reciprocity, Theorem 8.6, by drawing, 
for the case p = 19, q = 13, the 13 by 19 rectangle used in the proof; also
draw the diagonal of the rectangle. Include within this drawing the
small rectangle whose upper right-hand corner is at the point $\left(\frac{p}{2}, \frac{q}{2}\right)$.

It is best to use quad-ruled graph paper so that you can be sure the 
diagonal does not cross any lattice points. 

(a) Count the number of lattice points inside the small rectangle, 
and verify that this number equals $\frac{p-1}{2} \cdot \frac{q-1}{2}$. 

(b) Verify the third key idea in the proof by showing that for each 
even-numbered column i, where $10 \le i \le 18$, the parity of the 
number of lattice points in column i below the diagonal is the 
same as the parity of the number of lattice points in the odd 
column $p - i$ below the diagonal. Also show that these odd 
columns account for all the odd columns in the small rectangle. 

(c) Verify the key idea in the proof of Eisenstein’s lemma that the 
residues $r_i$ for the even multiples of q: 

$$2q, \; 4q, \; \ldots, \; (p-1)q$$ 

have the property that the numbers 

$$(-1)^{r_2} r_2, \; (-1)^{r_4} r_4, \; \ldots, \; (-1)^{r_{p-1}} r_{p-1}$$ 

are, in some order, congruent modulo p to the even numbers 

$$2, \; 4, \; \ldots, \; p-1.$$ 

(d) Evaluate $\left(\frac{13}{19}\right)$ using Eisenstein’s lemma, Theorem 8.7. 

(e) Evaluate $\left(\frac{13}{19}\right)$ by counting lattice points in your diagram. 

8.22 (S) Gauss found a remarkable formula for writing a prime of the form 
$p = 4k + 1$ as a sum of two squares, $p = a^2 + b^2$. The number a is simply 
the residue of $\frac{1}{2}\binom{2k}{k}$ mod p that lies between $-\frac{p}{2}$ and $\frac{p}{2}$, and the number 
b is the residue of $(2k)!\,a$ modulo p that also lies between $-\frac{p}{2}$ and $\frac{p}{2}$. 

First, use Gauss’s formula to express the prime 13 as a sum of two 
squares. Then, to see how quickly Gauss’s formula — as beautiful as it 
is — becomes completely impractical as primes get larger, use it to 
express the prime 89 as a sum of two squares. 


Primes I 


The object we have been studying in this book is the set of numbers 
called the natural numbers, that is, the set of positive integers 

1, 2, 3, 4, 5, ... . 

No subset of the natural numbers is more important, or more myste- 
rious, than the sequence of prime numbers 

2, 3, 5, 7, 11, 13, 17, 19, 23, ... . 

That primes play the central role in the study of numbers has been 
recognized since ancient times, yet even today we are nowhere near any 
complete picture of exactly how they fit into the scheme of things. Paul 
Erdős, who perhaps knew more about primes than any human who ever 
lived, warned us: 

It will be another million years, at least, before we understand the 
primes. 

It is my goal in these next two chapters to give you a sense of what 
Paul Erdős meant by this. After more than two thousand years, we can 
indeed say that we know an enormous amount about prime numbers, 
but it is what we don’t yet know that is even more fascinating! We will 
discuss three broad topics concerning primes, as well as many open 
questions that arise along the way. The first two topics are fundamental, 
and were raised by Gauss in 1801: 

The problem of distinguishing prime numbers from composite 
numbers, and of resolving the latter into their prime factors, is 
one of the most important and useful in arithmetic. 

Gauss actually identified two closely related problems here; but, as 
we shall see in this chapter, these are in fact two different questions: 

1. Can we tell whether a number n is prime? 

2. Can we factor a number n into its prime factors? 

The third broad topic, which we will discuss in the next chapter, 
concerns how the primes are distributed among the integers. This is 
the real source of the mystery about the primes. On the one hand, 
there seems to be almost no predictability at all; on the other hand, 
from some perspectives, there is astonishing regularity. You can feel that 
something profound must be going on. But, as Erdős said, it may take a 
million years to fully understand what it is. The best we can do is chip 
away at it, one idea at a time. 


Factoring 

As for the problem of factoring a number n into its prime factors, Gauss 
went on to say, “we must confess that all methods that have been 
proposed thus far are either restricted to very special cases or are so 
laborious and difficult that even for numbers that do not exceed the 
limits of tables constructed by estimable men, they try the patience of 
even the practiced calculator. And these methods do not apply at all to 
larger numbers.” 

It really is extraordinary that, more than two hundred years later, and 
with fast, powerful computers available to us, we still have exactly the 
same complaint that Gauss did: factoring is too hard, and takes way too 
long. 

In 1977, Ronald L. Rivest, Adi Shamir, and Leonard Adleman in- 
vented a public key encryption system that exploits our inability to factor 
large composite numbers. These RSA public key systems make possible 
millions of transactions that take place every day; for example, they 
protect your credit card number when you buy a plane ticket online. 
In Chapter 13 we will see that these RSA encryption systems are a direct 
application of Euler’s theorem, Theorem 7.1. So, you might want to give 
Euler a little thank you the next time you purchase something online, 
or make that wire transfer to your offshore account in the Cayman 
Islands. 

In Chapter 13 we will also see that these public key encryption 
systems work precisely because it is hard to factor large numbers. But 
why is factoring so hard? Well, the basic strategy, or algorithm, is to 
try to divide a number n by each of the primes 2, 3, 5, 7, 11, ... , up 
to $\sqrt{n}$ until you find a prime factor p. Then you repeat the process on 
the number $n/p$. The problem is that, as simple as this strategy is, this is an 
example of what is called an exponential-time algorithm in the sense that 
if you double the number of digits in the number n, then you roughly 
square the amount of time the algorithm takes. 

With a relatively small six- or seven-digit number n you can imagine 
checking the 200 or so primes up to $\sqrt{n}$, but when you double the 
number of digits, you’ll be testing more than $40\,000 = 200^2$ primes 
before you reach the square root of a thirteen-digit number. So it will 
take you more than 200 times as long! For example, using this trial 
division algorithm with a calculator and a table of primes you might be 
able to factor the number 2 146 189 in less than fifteen minutes. But 
using the same method — assuming you can find a table of primes up to 
a million — for the number 807 163 866 931 could take you well over 
fifty hours. Of course, with a sophisticated calculator that has a built-in 
factoring capability you can factor 2 146 189 virtually instantaneously 
and 807 163 866 931 in a matter of seconds. 
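
The trial division strategy just described can be sketched in a few lines of Python. This is only an illustration under some assumptions of our own: the name `trial_division` is hypothetical, and instead of consulting a table of primes the sketch divides by 2 and then by every odd number (a composite trial divisor can never succeed, because its prime factors have already been divided out).

```python
def trial_division(n):
    """Return the prime factors of n >= 2, with multiplicity, in increasing order."""
    factors = []
    d = 2
    # It is enough to try divisors d with d*d <= (what is left of) n.
    while d * d <= n:
        while n % d == 0:
            factors.append(d)   # d is a prime factor; divide it out
            n //= d
        d += 1 if d == 2 else 2  # after 2, try only odd candidates
    if n > 1:
        factors.append(n)        # whatever remains is itself prime
    return factors


if __name__ == "__main__":
    print(trial_division(2146189))  # the six- or seven-digit example in the text
```

For the number 2 146 189 mentioned above this returns the two primes 1459 and 1471. The loop runs up to about $\sqrt{n}$ times, which is exactly why the method bogs down as the number of digits grows.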

However, the harsh fact is that even with the best factoring methods 
currently known, and the fastest computers available, we can just barely 
hope to factor a number n that is the product of two primes p and q each 
having 100 digits. In fact, the current state of the art in factorization can 
be measured by the recent factorization of two extremely large numbers 
known as RSA-200 and RSA-768. 

In 2005, a team from Germany and the Netherlands successfully 
factored a 200-digit number, RSA-200, as a product of two primes each 
having 100 digits (hence the name of the number, RSA-200), using the 
number field sieve method. This factorization took more than two years 
and 80 AMD Opteron CPUs working in parallel. It has been estimated 
that a single 2.2 GHz Opteron processor would have needed 
seventy-five years to complete this same task! 

Here is RSA-200 and its factorization: 

27997833911221327870829467638722601621070446786955 
42853756000992932612840010760934567105295536085606 
18223519109513657886371059544820065767750985805576 
13579098734950144178863178946295187237869221823983 

= 

35324619344027701212726049781984643686711974001976 
25023649303468776121253679423200058547956528088349 

× 

79258699544783330333470858414800596877379758573642 
19960734330341455767872818152135381409304740185467. 

Then, at 20:16 GMT on December 12, 2009, another team, also using 
the number field sieve method, found the factorization of RSA-768, a 
number with 768 binary digits (232 decimal digits), a task which took two 
years and many hundreds of machines (and would have taken a single 
2.2 GHz Opteron processor about fifteen hundred years): 


1230186684530117755130494958384962720772853569595334792197 
3224521517264005072636575187452021997864693899564749427740 
6384592519255732630345373154826850791702612214291346167042 
9214311602221240479274737794080665351419597459856902143413 

= 

3347807169895689878604416984821269081770479498371376856891 
2431388982883793878002287614711652531743087737814467999489 

× 

3674604366679959042824463379962795263227915816434308764267 
6032283815739666511279233373417143396810270092798736308917. 


This is about the point where doubling the number of digits makes 
the factoring problem unsolvable for us with current methods and 
technology. Faster computers will help somewhat, of course, but any 
real breakthrough on the factoring problem needs to come in the form 
of new factoring methods. Let us now take a look at one of the best 
current methods. 


The Quadratic Sieve Method 

In Problem 5.33, we introduced Fermat’s method for factoring a large 
number n, a method that works well when there is a factor that is 
relatively close to $\sqrt{n}$. 

For example, to factor $n = 1643$, we compute $\lfloor \sqrt{1643} \rfloor = 40$, and then 
look for a square in the sequence 

$$41^2 - 1643, \quad 42^2 - 1643, \quad 43^2 - 1643, \quad \ldots .$$ 

Since $42^2 - 1643 = 121 = 11^2$, we found a square very quickly in this 
case, and we can write 

$$1643 = 42^2 - 11^2 = (42 + 11)(42 - 11) = 53 \cdot 31,$$ 


and we have successfully factored 1643. 

In 1643, Mersenne wrote to Fermat, asking him to factor the number 
100 895 598 169. Fermat wrote back almost immediately with the 
factorization $100\,895\,598\,169 = 898\,423 \cdot 112\,303$, although it is very 
unlikely he used his method in this case, since finding a square in the 
sequence 

\begin{align*}
&317\,641^2 - 100\,895\,598\,169, \\
&317\,642^2 - 100\,895\,598\,169, \\
&317\,643^2 - 100\,895\,598\,169, \\
&\qquad \vdots \\
&505\,363^2 - 100\,895\,598\,169 = 393\,060^2
\end{align*}

takes 187 722 steps. We don’t know how he factored this number. 
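
Fermat's method, as used in the examples of this section, can be sketched as follows. The function name `fermat_factor` and the returned step count are our own choices for illustration, and the sketch assumes that n is odd (so that the search is guaranteed to stop).

```python
from math import isqrt


def fermat_factor(n):
    """Search x = ceil(sqrt(n)), ceil(sqrt(n)) + 1, ... until x*x - n is a square y*y.

    Returns (x - y, x + y, steps), so that n == (x - y) * (x + y).
    Assumes n is odd and greater than 1.
    """
    x = isqrt(n)
    if x * x < n:
        x += 1                   # start at the smallest x with x*x >= n
    steps = 0
    while True:
        steps += 1
        y_squared = x * x - n
        y = isqrt(y_squared)
        if y * y == y_squared:   # found a perfect square
            return x - y, x + y, steps
        x += 1


if __name__ == "__main__":
    print(fermat_factor(1643))   # a square turns up on the second try
```

For $n = 1643$ the square $42^2 - 1643 = 11^2$ appears on the second step. The number of steps grows with the distance between $\sqrt{n}$ and the average of the two factors, which is why Mersenne's number above would keep this loop busy for a very long time by hand.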

Here is another example where Fermat’s method eventually works, 
but takes a long time because the factors of n again are a long way from 
$\lfloor \sqrt{n} \rfloor$. Let’s use Fermat’s method to factor $n = 1649$. Since $\lfloor \sqrt{1649} \rfloor = 40$, 
we look for a square in the sequence 

\begin{align*}
41^2 - 1649 &= 32, \\
42^2 - 1649 &= 115, \\
43^2 - 1649 &= 200, \\
44^2 - 1649 &= 287, \\
&\;\;\vdots
\end{align*}

and finally we get to 

$$57^2 - 1649 = 1600 = 40^2,$$ 

and we can write 

$$1649 = 57^2 - 40^2 = (57 + 40)(57 - 40) = 97 \cdot 17.$$ 

In this case, we clearly would have been far better off simply dividing by 
the primes 2, 3, 5, 7, ... , until we reached 17 instead of using Fermat’s 
method! 

In 1981, Carl Pomerance (now at Dartmouth College and one of the 
leading number theorists in the world, but then at the University of 
Georgia) developed a factoring algorithm based on Fermat’s method, 
called the quadratic sieve, that suddenly made it possible to factor 
numbers in the neighborhood of 100 digits or more. Here is a simple example 
to illustrate the basic idea. 

If we begin Fermat’s method for the number $n = 1649$ again: 

\begin{align*}
41^2 - 1649 &= 32, \\
42^2 - 1649 &= 115, \\
43^2 - 1649 &= 200, \\
44^2 - 1649 &= 287,
\end{align*}

we notice that the two numbers on the right in the first and third lines 
together form a square when multiplied together: 

$$32 \cdot 200 = 80^2.$$ 

Now, multiply the first and third lines together, modulo 1649, to get 
$41^2 \cdot 43^2 \equiv 80^2 \pmod{1649}$, which we rewrite as 

$$(41 \cdot 43 + 80)(41 \cdot 43 - 80) \equiv 0 \pmod{1649},$$ 

and which we then reduce, modulo 1649, to 

$$194 \cdot 34 \equiv 0 \pmod{1649}.$$ 

So any prime that divides 1649 must also divide one of 194 or 34. 

Hence any prime that divides 1649 will be found in the greatest 
common divisor of 1649 and 194, or in the greatest common divisor 
of 1649 and 34. In general, at this point we would use the Euclidean 
algorithm to compute these two greatest common divisors, but in this 
case, since $194 = 2 \cdot 97$ and $34 = 2 \cdot 17$, we can immediately spot the 
prime factors: $1649 = 97 \cdot 17$. 

We can see that the main idea that makes this quadratic sieve method 
work is coming up with the numbers 32 and 200 whose product is 
a square in the first place. We can do this in general by recalling 
that in the prime decomposition of square numbers the exponents 
are even. So, for example, the reason $32 \cdot 200$ is a square is because 
$32 \cdot 200 = (2^5)(2^3 \cdot 5^2) = 2^8 \cdot 5^2$, and the exponents are even. Hence, 
in general, in the quadratic sieve method, the main idea is to scan 
the numbers you get using Fermat’s method (that is, numbers such as 
32, 115, 200, 287, ... , in this example) to find a combination of these 
numbers whose combined prime decompositions will have only even 
exponents. 
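
The small example with $n = 1649$ can be replayed in code. The sketch below is only an illustration: the name `find_square_pair` is hypothetical, it looks only at pairs of lines (rather than larger combinations), and it tests "is the product a square?" directly with an integer square root instead of inspecting exponents.

```python
from itertools import combinations
from math import gcd, isqrt


def find_square_pair(n, count=4):
    """Look at the first `count` values x*x - n from Fermat's method and
    return ((x1, x2), root, (p, q)) for the first pair whose product is a
    perfect square root*root, where p*q == n. Returns None if no pair works.
    """
    start = isqrt(n) + 1
    values = [(x, x * x - n) for x in range(start, start + count)]
    for (x1, v1), (x2, v2) in combinations(values, 2):
        root = isqrt(v1 * v2)
        if root * root == v1 * v2:
            # (x1*x2)^2 is congruent to root^2 modulo n, so a gcd may split n.
            g = gcd(x1 * x2 - root, n)
            if 1 < g < n:
                return (x1, x2), root, tuple(sorted((g, n // g)))
    return None


if __name__ == "__main__":
    print(find_square_pair(1649))
```

For 1649 this picks out the lines for 41 and 43, finds the root 80, and splits 1649 as 17 times 97, matching the computation above.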

As you can imagine, when looking for this combination of numbers, 
it will be simpler from a practical point of view to use numbers with 
similar small primes in their decomposition. So, in the example we just 
did, the numbers $115 = 5 \cdot 23$ and $287 = 7 \cdot 41$ were not likely to be useful because they 
have 23 and 41, respectively, in their prime decompositions, and in this 
example at least, these are considered to be rather large primes. This is 
where the notion of a “sieve” comes into this method, as we will see in 
the next example. 

Let’s factor n = 27 667 with this quadratic sieve method. We begin 
by computing each of the following, and expressing each resulting 
number in terms of its prime decomposition, all the while looking for a 
combination of the numbers on the right-hand side that will produce a 
square by collectively having only even exponents: 


\begin{align*}
167^2 - 27667 &= 2 \cdot 3 \cdot 37, \\
168^2 - 27667 &= 557, \\
169^2 - 27667 &= 2 \cdot 3 \cdot 149, \\
170^2 - 27667 &= 3^2 \cdot 137, \\
171^2 - 27667 &= 2 \cdot 787, \\
172^2 - 27667 &= 3^3 \cdot 71, \\
173^2 - 27667 &= 2 \cdot 3 \cdot 13 \cdot 29, \\
174^2 - 27667 &= 2609, \\
175^2 - 27667 &= 2 \cdot 3 \cdot 17 \cdot 29, \\
176^2 - 27667 &= 3 \cdot 1103, \\
177^2 - 27667 &= 2 \cdot 1831, \\
178^2 - 27667 &= 3 \cdot 13 \cdot 103, \\
179^2 - 27667 &= 2 \cdot 3^7, \\
180^2 - 27667 &= 4733, \\
181^2 - 27667 &= 2 \cdot 3^2 \cdot 283, \\
182^2 - 27667 &= 3 \cdot 17 \cdot 107, \\
183^2 - 27667 &= 2 \cdot 41 \cdot 71, \\
184^2 - 27667 &= 3 \cdot 2063, \\
185^2 - 27667 &= 2 \cdot 3 \cdot 1093, \\
186^2 - 27667 &= 13^2 \cdot 41.
\end{align*}


At this stage we can spot four lines where the prime decomposition 
involves only the five primes 2, 3, 13, 41, 71: 

\begin{align*}
172^2 - 27667 &= 3^3 \cdot 71, \\
179^2 - 27667 &= 2 \cdot 3^7, \\
183^2 - 27667 &= 2 \cdot 41 \cdot 71, \\
186^2 - 27667 &= 13^2 \cdot 41,
\end{align*}

and collectively the exponent on each prime is even. 

We multiply these four lines together, modulo 27667, to get 

$$172^2 \cdot 179^2 \cdot 183^2 \cdot 186^2 \equiv 2^2 \cdot 3^{10} \cdot 13^2 \cdot 41^2 \cdot 71^2 \pmod{27667},$$ 

which we rewrite as 

$$(172 \cdot 179 \cdot 183 \cdot 186 + 2 \cdot 3^5 \cdot 13 \cdot 41 \cdot 71)(172 \cdot 179 \cdot 183 \cdot 186 - 2 \cdot 3^5 \cdot 13 \cdot 41 \cdot 71) \equiv 0 \pmod{27667},$$ 

and then reduce, modulo 27667, to get 

$$12128 \cdot 25842 \equiv 0 \pmod{27667}.$$ 

Now it is just a matter of using the Euclidean algorithm to compute 
the two greatest common divisors. We begin with gcd(12128, 27667): 

\begin{align*}
27667 &= 2 \cdot 12128 + 3411, \\
12128 &= 3 \cdot 3411 + 1895, \\
3411 &= 1 \cdot 1895 + 1516, \\
1895 &= 1 \cdot 1516 + 379, \\
1516 &= 4 \cdot 379 + 0,
\end{align*}

so $\gcd(12128, 27667) = 379$. 

Therefore, 379 is a factor of 27667, and we can now quickly write the 
complete factorization as $27667 = 73 \cdot 379$ without having to compute 
the other gcd. 

In this example, then, the sieve worked something like this: “ignore 
any line where the prime decomposition includes a prime greater than 
100.” This is the mesh size for the sieve. So, in this case, for the twenty 
lines we looked at above, the sieve allowed only seven lines to pass 
through, the four lines we ended up using along with the following 
three lines: 

\begin{align*}
167^2 - 27667 &= 2 \cdot 3 \cdot 37, \\
173^2 - 27667 &= 2 \cdot 3 \cdot 13 \cdot 29, \\
175^2 - 27667 &= 2 \cdot 3 \cdot 17 \cdot 29.
\end{align*}

Once the sieve sifted out the large primes, we could, in this case, find 
by inspection four lines among the seven remaining lines such that the 
total product of their prime decompositions would have only even 
exponents. In general, this step needs to be done by computer using 
routine techniques in linear algebra. Each prime decomposition is 
recorded as a vector of exponents, with one coordinate for each of the 
primes 2, 3, 5, 7, 11, 13, ... in order. For example, the prime 
decomposition $2 \cdot 3^7$ would be represented first as a vector 
$(1, 7, 0, 0, 0, \ldots)$, and then as the vector 

$$(1, 1, 0, 0, 0, \ldots),$$ 

since we only care whether exponents are odd or even; the prime 
decomposition $2 \cdot 3 \cdot 13 \cdot 29$ would be represented as the vector 

$$(1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0);$$ 

and the prime decomposition $2 \cdot 3 \cdot 17 \cdot 29$ would be the vector 

$$(1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0),$$ 

and so on. 
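
To see the whole procedure for $n = 27667$ in one place, here is a toy sketch. It is not how a real quadratic sieve is implemented: the names `smooth_factor` and `toy_quadratic_sieve` are hypothetical, the candidates are factored by plain trial division rather than sieved, and the linear algebra step is replaced (as a stated simplification) by a brute-force search over subsets of the surviving lines, which is only feasible because so few lines pass through.

```python
from itertools import combinations
from math import gcd, isqrt, prod


def smooth_factor(m, bound):
    """Return {prime: exponent} if every prime factor of m is <= bound, else None."""
    exponents = {}
    d = 2
    while d <= bound and m > 1:
        while m % d == 0:
            exponents[d] = exponents.get(d, 0) + 1
            m //= d
        d += 1
    return exponents if m == 1 else None


def toy_quadratic_sieve(n, count=20, bound=100):
    """Return (xs, p, q): the x-values of the lines used and a splitting n == p*q."""
    start = isqrt(n) + 1
    # Keep only the lines whose value x*x - n passes through the "mesh".
    lines = []
    for x in range(start, start + count):
        exponents = smooth_factor(x * x - n, bound)
        if exponents is not None:
            lines.append((x, exponents))
    # Brute-force stand-in for the linear algebra: try subsets, smallest first.
    for size in range(1, len(lines) + 1):
        for subset in combinations(lines, size):
            total = {}
            for _, exponents in subset:
                for p, e in exponents.items():
                    total[p] = total.get(p, 0) + e
            if all(e % 2 == 0 for e in total.values()):
                x = prod(x for x, _ in subset) % n
                y = prod(p ** (e // 2) for p, e in total.items()) % n
                g = gcd(x - y, n)
                if 1 < g < n:
                    return sorted(x for x, _ in subset), g, n // g
    return None


if __name__ == "__main__":
    print(toy_quadratic_sieve(27667))
```

With the default mesh size of 100 and the twenty lines from 167 to 186, seven lines survive, and the only combination with all exponents even is the one found above: the lines for 172, 179, 183, and 186, which split 27667 as 73 times 379.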

In practice, when applying the quadratic sieve method, a very deli- 
cate balance needs to be found between setting a mesh size that is too 
fine — which results in long waits for numbers to show up with primes 
small enough to pass through the sieve — and setting a mesh size that is 
too coarse — which allows so many numbers to pass through the sieve 
that the linear algebra simply becomes completely unwieldy. 

Gauss would be quite impressed to see the quadratic sieve in action 
today. In 1994, the quadratic sieve was able to factor the 129-digit 
integer presented as a challenge by Martin Gardner in his August 
1977 Scientific American column that first introduced the RSA encryp- 
tion system to the world. It is still one of the very best methods 
we have for factoring numbers. A related method, the number field 
sieve, is far more complicated, but it can be used for numbers that are 
too large for the quadratic sieve. Another method, called the elliptic 
curve method, can also factor numbers which have prime factors of 
about 50 digits. 


Is n Prime? 

What do we do when we can’t factor a number n because it is too large? 
Can we at least determine whether n is prime? That is, is there a way to 
test a number to decide whether it is prime or composite? 

The obvious test (dividing a number n by each of the primes 
2, 3, 5, 7, 11, ... , up to $\sqrt{n}$) is fine for small values of n, but way 
too slow for large values. Another test that is good in theory, but not 
in practice, is the converse of Wilson's theorem (see Problem 6.7): if 
$(n - 1)! \equiv -1 \pmod{n}$ for a positive integer n, then n is prime. This 
test is useless in practice because all known algorithms for computing 
factorials modulo n take exponential time. 

Fermat’s little theorem, Theorem 5.2, also provides us with a way to 
test a number n for primeness since it tells us that if n is prime, then 
$a^n \equiv a \pmod{n}$ for any integer a. Therefore, if for some number a we 
should ever discover that $a^n \not\equiv a \pmod{n}$, then we would immediately 
know that n is composite. For this test to be useful, however, we need 
to be able to compute $a^n$ modulo n efficiently. This can be done using a 
method known as fast exponentiation. 

In order to see how fast exponentiation works, let’s use this method 
to test whether $n = 667$ is prime. We will let $a = 2$ and compute 
$2^{667}$ modulo 667. If we discover that $2^{667} \not\equiv 2 \pmod{667}$, then we can 
conclude that 667 is not prime. The basic idea of fast exponentiation is 
to compute $2^{667}$ modulo 667 by a sequence of steps, starting with the 
number 2, that involve only the operations of squaring and multiplying 
by $a = 2$. For example, if we know the value of $2^{333}$ modulo 667, we 
can quickly compute $2^{667} = 2 \cdot (2^{333})^2$ modulo 667. To compute $2^{333}$ in 
this same way, we’d first need to know $2^{166}$. In turn, we could compute 
$2^{166}$ by squaring $2^{83}$. Continuing in this fashion, we see that in order to 
compute $2^{667}$, we need to compute each of the following powers of 2: 

$$2^{667}, \; 2^{333}, \; 2^{166}, \; 2^{83}, \; 2^{41}, \; 2^{20}, \; 2^{10}, \; 2^{5}, \; 2^{2}, \; 2.$$ 

So we begin with 2, and get $2^2 = 4$. Then we can compute 
$2^5 = 2 \cdot (2^2)^2 = 2 \cdot 4^2 = 32$, and $2^{10} = 32^2 = 1024 \equiv 357 \pmod{667}$. Next, we 
have $2^{20} \equiv 357^2 = 127\,449 \equiv 52 \pmod{667}$. Continuing, we get 

\begin{align*}
2^{41} &\equiv 2 \cdot 52^2 = 5408 \equiv 72 \pmod{667}, \\
2^{83} &\equiv 2 \cdot 72^2 = 10\,368 \equiv 363 \pmod{667}, \\
2^{166} &\equiv 363^2 = 131\,769 \equiv 370 \pmod{667}, \\
2^{333} &\equiv 2 \cdot 370^2 = 273\,800 \equiv 330 \pmod{667}, \\
2^{667} &\equiv 2 \cdot 330^2 = 217\,800 \equiv 358 \pmod{667}.
\end{align*}

Since $2^{667} \not\equiv 2 \pmod{667}$, we conclude that 667 is not prime. 

This, of course, was a lot of work just to learn that $667 = 23 \cdot 29$ is 
composite, but the real point is that if we want to test a number that is 
twice as big as 667, we would only have to do one more step using this 
method. That's why fast exponentiation is an efficient method to use 
for large numbers. 

Note that in binary $667 = 1010011011$ and we can use this binary 
representation to produce the powers of 2 to be computed. Reading 
1010011011 left to right tells us to begin with $2^1$; the second binary 
digit is 0, so we square to get the next term $2^2$ (that is, we double the 
exponent); the third binary digit is 1, so we square and multiply by 2 to 
get the next term $2^5$ (that is, we double the exponent and add 1). Thus, 
the exponents, written in binary, are just 1, 10, 101, 1010, ... ; that is, 1, 2, 5, 10, ... . 
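
The binary recipe just described translates almost word for word into code. The sketch below is illustrative only: the name `fast_power_trace` is our own, and it records every intermediate exponent and residue so the output can be compared with the hand computation (Python's built-in `pow(a, n, m)` computes the same final value).

```python
def fast_power_trace(a, n, m):
    """Compute a**n mod m by reading the binary digits of n left to right.

    Returns a list of (exponent, residue) pairs, one per binary digit of n.
    Assumes n >= 1 and m >= 2.
    """
    bits = bin(n)[2:]            # e.g. 667 -> "1010011011"
    exponent, value = 1, a % m   # the leading digit is always 1
    trace = [(exponent, value)]
    for bit in bits[1:]:
        exponent *= 2            # squaring doubles the exponent
        value = value * value % m
        if bit == "1":
            exponent += 1        # multiplying by a adds 1 to the exponent
            value = value * a % m
        trace.append((exponent, value))
    return trace


if __name__ == "__main__":
    for exponent, residue in fast_power_trace(2, 667, 667):
        print(exponent, residue)
```

Run on $a = 2$ and $n = 667$, the exponents come out as 1, 2, 5, 10, 20, 41, 83, 166, 333, 667 and the last residue is 358, in agreement with the computation above. The number of steps equals the number of binary digits of n, so doubling n adds just one step.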


Pseudoprimes 

We have seen that using Fermat’s little theorem together with fast ex- 
ponentiation is a very good way to test a large number n for primeness. 
However, this test can occasionally fail to resolve the question as to 
whether an integer n is prime. In fact, you have already seen a number, 
namely $n = 341$, for which this test produces a very surprising result. 
The number 341 is clearly composite since $341 = 11 \cdot 31$, and yet, as we 
saw in Problem 5.9, $2^{341} \equiv 2 \pmod{341}$. 

Let’s revisit this problem and look at what happens when we apply 
our test to 341. Since in binary we can write $341 = 101010101$, we need 
to compute 

$$2^{1}, \; 2^{2}, \; 2^{5}, \; 2^{10}, \; 2^{21}, \; 2^{42}, \; 2^{85}, \; 2^{170}, \; 2^{341}$$ 

modulo 341. If we do this, we get, modulo 341, 

$$2, \; 4, \; 32, \; 1, \; 2, \; 4, \; 32, \; 1, \; 2$$ 

(this happens because $32^2 = 1024 \equiv 1 \pmod{341}$). Thus $2^{341} \equiv 2 \pmod{341}$. 
So, in this instance, the test fails to show us that 341 is 
composite. 

This is surprising, but it is hardly the end of the world, since as it 
turns out we can just try another value for the number a, and the test 
will indeed work for this new value of a. For example, let's repeat the 
test using $a = 3$, that is, by computing $3^{341}$. Now we need to compute 

$$3^{1}, \; 3^{2}, \; 3^{5}, \; 3^{10}, \; 3^{21}, \; 3^{42}, \; 3^{85}, \; 3^{170}, \; 3^{341}$$ 

modulo 341. We get 

$$3, \; 9, \; 243, \; 56, \; 201, \; 163, \; 254, \; 67, \; 168.$$ 

Therefore, since $3^{341} \equiv 168 \not\equiv 3 \pmod{341}$, we can conclude that 
341 is not prime. 

Hence, for the number $n = 341$, this test (Fermat's little theorem 
together with fast exponentiation) only temporarily failed to determine 
that 341 is composite. We simply had to try values for a other than 2 in 
order to get a definitive result. 

A composite positive integer n that, like 341, divides $2^n - 2$ is called a 
pseudoprime. In other words, a pseudoprime is a composite number that 
fools us because it behaves like a prime number with respect to Fermat’s 
little theorem for the value $a = 2$. In Problem 9.6 you will show that 
there are infinitely many pseudoprimes. 
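
It is easy to hunt for small pseudoprimes by machine. The following sketch is only an illustration: the helper names `is_prime` and `is_pseudoprime` are hypothetical, primality is decided by simple trial division, and Python's built-in `pow(a, n, n)` stands in for fast exponentiation.

```python
def is_prime(n):
    """Decide primality by trial division (fine for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_pseudoprime(n, a=2):
    """True if n is composite and yet a**n is congruent to a modulo n."""
    return n > 1 and not is_prime(n) and pow(a, n, n) == a % n


if __name__ == "__main__":
    print([n for n in range(2, 1000) if is_pseudoprime(n)])
    print(pow(3, 341, 341))  # the test with a = 3 exposes 341
```

Below 1000 the search finds only three pseudoprimes: 341, 561, and 645. The second line prints 168, confirming that $3^{341} \not\equiv 3 \pmod{341}$.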

On the other hand, in 1949, Paul Erdős showed that the number of 
pseudoprimes is considerably smaller than the number of primes in 
the sense that the number of pseudoprimes from 1 to n is significantly 
less than the number of primes from 1 to n predicted by the prime 
number theorem. 


Absolute Pseudoprimes 

In terms of testing an integer n for primeness, we don’t feel too con- 
cerned about pseudoprimes such as 341, because, although our test 
initially fails for a = 2, it seems that we can simply choose another value 
such as a = 3 and then redo the test. But could there be a number n 
for which this test would fail completely? That is, could there possibly 
exist a composite number n such that $a^n \equiv a \pmod{n}$ for all integers 
a? In other words, since n would divide $a^n - a$ no matter what value of 
a we tried, our test would never be able to discover that n is composite. 
Unfortunately, such numbers do exist! 

A composite positive number n such that $a^n \equiv a \pmod{n}$ for any 
integer a is called an absolute pseudoprime. The smallest absolute 
pseudoprime is 561 and was discovered in 1909 by R. D. Carmichael. As a 
result, these numbers are also sometimes called Carmichael numbers. 
Carmichael conjectured that there are infinitely many absolute 
pseudoprimes. In 1949, Paul Erdős declared that this seemed to be a very 
difficult question. He was right. This conjecture was not proven until 
1994, by W. R. Alford, Andrew Granville, and Carl Pomerance. Because 
of the existence of absolute pseudoprimes (Carmichael numbers), the 
test using Fermat’s little theorem is simply unable to detect that an 
absolute pseudoprime is composite. 

On the other hand, absolute pseudoprimes are extremely rare. There 
are only seven absolute pseudoprimes less than 10 000: 

561, 1105, 1729, 2465, 2821, 6601, 8911. 

Let’s prove that 561 is an absolute pseudoprime. Since the prime 
decomposition of 561 is $561 = 3 \cdot 11 \cdot 17$, in order to show that $a^{561} \equiv a 
\pmod{561}$ for any integer a, all we have to do is show that $a^{561} \equiv a$ 
modulo each of these three primes. If one of these primes divides a, then 
both sides are congruent to 0 modulo that prime and there is nothing to 
prove. Otherwise, we use Fermat’s little theorem as follows: 

\begin{align*}
a^{561} &= (a^{2})^{280} a \equiv (1)^{280} a = a \pmod{3}, \\
a^{561} &= (a^{10})^{56} a \equiv (1)^{56} a = a \pmod{11}, \\
a^{561} &= (a^{16})^{35} a \equiv (1)^{35} a = a \pmod{17}.
\end{align*}

Therefore, 561 is an absolute pseudoprime. 
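
The list of absolute pseudoprimes below 10 000 can be checked directly from the definition. This is a brute-force sketch under our own naming (`is_prime`, `is_absolute_pseudoprime`); it simply tries every residue a modulo n, which is practical only for small n.

```python
def is_prime(n):
    """Decide primality by trial division (fine for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_absolute_pseudoprime(n):
    """True if n is composite and a**n is congruent to a (mod n) for every a.

    It is enough to test the residues a = 0, 1, ..., n - 1.
    """
    if n < 4 or is_prime(n):
        return False
    return all(pow(a, n, n) == a for a in range(n))


if __name__ == "__main__":
    print([n for n in range(2, 10000) if is_absolute_pseudoprime(n)])
```

The search is fast because almost every composite number already fails for $a = 2$; only the seven numbers listed above survive all values of a.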


A Probabilistic Test 

When testing whether a large number n is prime, we are hoping to get 
a definitive yes or no answer. However, when faced with an extremely 
large number, this may be an impractical wish since it simply may take 
too long to get an answer. A good thing to do in such a situation is to 
compromise and instead use a probabilistic test. We’ll describe one such 
test using the absolute pseudoprime 561. 

Recall that for an odd prime p the congruence $x^2 \equiv 1 \pmod{p}$ has 
exactly two solutions: $x \equiv 1$ and $x \equiv p - 1$ (that is, $x \equiv -1$). Now, since 
$2^{560} \equiv 1 \pmod{561}$, we see that $2^{280}$ is a solution to the congruence 
$x^2 \equiv 1 \pmod{561}$. Therefore, if 561 is prime, it follows that either 
$2^{280} \equiv 1 \pmod{561}$ or $2^{280} \equiv -1 \pmod{561}$. 

Now, if $2^{280} \equiv 1 \pmod{561}$, we can repeat this argument and 
conclude that either $2^{140} \equiv 1 \pmod{561}$ or $2^{140} \equiv -1 \pmod{561}$. And 
if it is also the case that $2^{140} \equiv 1 \pmod{561}$, then we repeat again and 
conclude that either $2^{70} \equiv 1 \pmod{561}$ or $2^{70} \equiv -1 \pmod{561}$. Finally, 
if $2^{70} \equiv 1 \pmod{561}$, we repeat one last time and conclude that either 
$2^{35} \equiv 1 \pmod{561}$ or $2^{35} \equiv -1 \pmod{561}$. 

In summary, the test for primeness is to check to see if $2^{35} \equiv 1 
\pmod{561}$ or if $2^{35k} \equiv -1 \pmod{561}$ for some $k = 1, 2, 2^2, 2^3$. If 
this condition is not satisfied, then we know immediately that 561 is 
not prime. If, on the other hand, this condition is satisfied, we cannot 
conclude with certainty that 561 is prime. This is where probability 
comes in. 

We have stated this test for the value $a = 2$ and for the number 
$n = 561$. It is known in general that if n is an odd composite number, 
the condition in this test will be satisfied by less than one fourth of 
the numbers from 1 to $n - 1$. Let’s pick ten values of a in this interval 
randomly and run this test on each number a. If the condition of the 
test fails for even one of these values of a, then we again know immediately 
that n is not prime. Suppose, however, that the condition is satisfied for 
all ten values of a. Then we clearly will be inclined to conclude that n is 
prime. The probability that this conclusion is wrong is less than $\left(\frac{1}{4}\right)^{10}$, 
which is slightly less than one in a million. With fifteen values of a we 
can improve the odds to less than one in a billion, and with twenty to 
less than one in a trillion! 
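
Here is a sketch of this probabilistic test for a general odd number n, where $n - 1$ is written as $2^s \cdot d$ with d odd (for 561 this gives $d = 35$ and $s = 4$). The names `passes_strong_test` and `probably_prime` are our own, and as a small assumption the random values of a are drawn from 2 to $n - 2$, since $a = 1$ and $a = n - 1$ always satisfy the condition.

```python
import random


def passes_strong_test(n, a):
    """Check the condition of the test for an odd n > 2 and a given a:
    a**d is 1 (mod n), or a**(d * 2**j) is -1 (mod n) for some 0 <= j < s,
    where n - 1 == 2**s * d with d odd.
    """
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):
        x = x * x % n            # move from a**(d*2**j) to a**(d*2**(j+1))
        if x == n - 1:
            return True
    return False


def probably_prime(n, rounds=10, rng=random):
    """Run the test for `rounds` random values of a; False means certainly composite."""
    if n < 5 or n % 2 == 0:
        return n in (2, 3)
    return all(passes_strong_test(n, rng.randrange(2, n - 1)) for _ in range(rounds))


if __name__ == "__main__":
    print([pow(2, 35 * k, 561) for k in (1, 2, 4, 8)])
    print(passes_strong_test(561, 2))
```

For 561 with $a = 2$, the residues of $2^{35}, 2^{70}, 2^{140}, 2^{280}$ are 263, 166, 67, 1. None of them is $-1$ and the first is not 1, so the condition fails and the test correctly reports that 561 is composite, even though 561 is an absolute pseudoprime.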


Can n Divide $2^n - 1$ or $2^n + 1$? 

There are infinitely many pseudoprimes, that is, numbers n such that 
$n \mid 2^n - 2$. Recall that a Mersenne prime is a prime number of the form 
$2^n - 1$. It is therefore quite natural to ask whether there are any integers 
n such that $n \mid 2^n - 1$. One might also ask a similar question: can n divide 
$2^n + 1$? 

It is not hard to come up with a conjecture that there are 
infinitely many values of n such that $n \mid 2^n + 1$. For example, since 
$3 \mid 2^3 + 1 = 9$, and $9 \mid 2^9 + 1 = 513$, one suspects that this may be true 
for all powers of 3: 3, 9, 27, 81, ... . You are asked to verify this in 
Problem 9.12. 

On the other hand, for numbers of the form $2^n - 1$, which we 
call Mersenne numbers, it is never the case that $n \mid 2^n - 1$ for $n > 1$. 
Let’s prove this fact. Suppose, by way of contradiction, that n is such 
a number; that is, suppose that $n > 1$ and that $n \mid 2^n - 1$. Clearly, n 
is odd. 

Let p be the smallest prime in the prime decomposition of n. Since n is 
odd, we know that $p > 2$. Also, by Fermat’s little theorem, we know that 
$p \mid 2^{p-1} - 1$. Next we let b be the least positive integer such that $p \mid 2^b - 1$. 
So $1 < b \le p - 1$. 

We claim that $b \mid n$, which would contradict the choice of p as the 
smallest prime divisor of n (since $1 < b < p$, any prime factor of b would 
be a prime divisor of n smaller than p). Write $n = qb + r$ with $0 \le r < b$. Then, 
modulo p, we get 

$$0 \equiv 2^n - 1 = 2^{qb+r} - 1 = (2^b)^q\, 2^r - 1 \equiv (1)^q\, 2^r - 1 = 2^r - 1 \pmod{p}.$$ 

Therefore, $r = 0$; otherwise, we have $0 < r < b$ and $p \mid 2^r - 1$, 
contradicting the choice of b. Thus $b \mid n$, as claimed, and the proof is complete. 
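
Both claims of this section are easy to test numerically for small n. The sketch below uses two hypothetical helper names of our own and Python's built-in modular `pow`; a numerical check over a finite range is of course no substitute for the proof.

```python
def divides_2n_minus_1(n):
    """True if n divides 2**n - 1."""
    return pow(2, n, n) == 1 % n


def divides_2n_plus_1(n):
    """True if n divides 2**n + 1."""
    return (pow(2, n, n) + 1) % n == 0


if __name__ == "__main__":
    # No n > 1 in this range divides 2**n - 1 ...
    print([n for n in range(2, 5000) if divides_2n_minus_1(n)])
    # ... while the powers of 3 all divide 2**n + 1.
    print([divides_2n_plus_1(3 ** k) for k in range(1, 9)])
```

The first list comes out empty, as the proof above guarantees, and every power of 3 tried passes the second check.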


Mersenne Primes 

We now turn our attention to prime numbers that have a very particular 
form, namely, the form $2^n - 1$, the Mersenne primes. As we discussed 
in Chapter 5, these primes were originally of special interest because 
of their deep connection with perfect numbers. These days, however, 
our interest in Mersenne primes has much more to do with the fact 
that they are far better candidates for primeness tests than your average 
integer. But, first, let’s put Mersenne primes once again in their proper 
historical context. 

Euclid proved that if the Mersenne number $2^n - 1$ is prime, then 
$2^{n-1}(2^n - 1)$ is perfect. This is Theorem 5.4. In Problem 5.14 we also saw 
Fermat’s result that the Mersenne number $2^n - 1$ can be prime only if 
n is prime. Euler proved that all even perfect numbers have exactly this 
form. The following proof of Euler’s result is based on a proof given by 
L. E. Dickson in 1911. 


Theorem 9.1. If n is an even perfect number, then $n = 2^{p-1}(2^p - 1)$ for some 
prime p such that $2^p - 1$ is prime. 


Proof 

Recall that a positive integer n is perfect if a(n) = 2 n, where a(n) is the 
sum of the positive divisors of n, and that in Problem 7.31 we verified 
that a (ri) is a multiplicative function. 

Since n is even, we write n = 2 k ~ 1 m, where m is odd and k > 1. We 
need to show that m = 2 k - 1 and that m is prime. 

Since n is perfect, we can write 

2 k m = 2 n = a (n) = <y(2 k ~ 1 m) = a( 2 k ~ l ) cr(m ) = ( 2 k - l)cr(m). 

Note that this implies that 2* - 1 divides m; therefore, m > 1. 

We will show that m is prime by showing that it has only two divisors:
m and 1. Let d be the sum of all the proper divisors of m; that is, d is the
sum of all the divisors of m other than m itself. Thus, $\sigma(m) = d + m$.
Then, we can write

$$2^k m = (2^k - 1)\,\sigma(m) = (2^k - 1)(d + m) = 2^k m + (2^k d - d - m).$$

So

$$(2^k - 1)\,d = m,$$

and d is a proper divisor of m. But d is also the sum of all the proper
divisors of m. The only way this can be true is if d = 1, which means that
the only divisors of m are 1 and m. Therefore, m is prime, and $m = 2^k - 1$.
Finally, by Fermat’s result in Problem 5.14, $2^k - 1$ being prime implies
that k is prime. This completes the proof. ■
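As a small numerical illustration of Theorem 9.1 (not part of the proof), the following Python sketch finds the even perfect numbers below 10 000 by brute force and checks that each has the form $2^{p-1}(2^p - 1)$. The function names `sigma`, `is_prime`, and `euclid_euler_exponent` are our own choices, and the bound 10 000 is arbitrary.

```python
def sigma(n):
    """Sum of all positive divisors of n, found by trial division."""
    total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total


def is_prime(n):
    """Simple trial-division primality check (fine for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def euclid_euler_exponent(n):
    """Write n = 2^(k-1) * m with m odd; return k if m = 2^k - 1 and
    both k and m are prime, otherwise return None."""
    k, m = 1, n
    while m % 2 == 0:
        m //= 2
        k += 1
    if m == 2 ** k - 1 and is_prime(k) and is_prime(m):
        return k
    return None


# Even numbers n below 10 000 with sigma(n) = 2n.
even_perfect = [n for n in range(2, 10_000, 2) if sigma(n) == 2 * n]
for n in even_perfect:
    print(n, "has exponent p =", euclid_euler_exponent(n))
```

Every even perfect number found this way should report a prime exponent p, in agreement with the theorem.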


Theorem 9.1 is an extraordinary result. It completes— some two 
thousand years later— one of the most beautiful theorems in Euclid’s 
Elements, Theorem 5.4, by telling us exactly what even perfect numbers 
look like. 

Nonetheless, Theorem 9.1 still leaves open the oldest unsolved prob- 
lem in all of mathematics: 


Is there an odd perfect number? 

This question was explicitly raised by Descartes in a letter to Mersenne 
in 1638, but surely such an obvious question would have occurred to 
Euclid and others as well. So far, no odd perfect number has been 
discovered. And, since 1991, it has been known that any odd perfect 
number would have to be greater than $10^{300}$. There is even an ongoing
project to increase this lower bound (see http://www.oddperfect.org/). 

On the other hand, if an odd perfect number does exist, we know a 
lot about what it must look like. For example, Euler proved that if n is 
an odd perfect number, then $n = pm^2$, where p is a prime of the form
4k + 1 (see Problem 9.16). It had been known since 1925 that an odd 
perfect number would have to have at least six distinct prime factors, 
and in his 1972 Ph.D. dissertation Carl Pomerance proved that an odd 
perfect number would have to have at least seven distinct prime factors. 
We now know that there must be at least seventy-five prime factors, 
of which at least nine must be distinct. More significantly, however, 
Carl Pomerance has given a highly convincing probabilistic argument, 
called the Pomerance heuristic, to support the conjecture that there are 
no odd perfect numbers. 

Meanwhile, the interest in Mersenne primes has shifted away from 
their relationship to perfect numbers and now resides entirely in the 
fact that the largest known primes are all Mersenne primes. There are 
good reasons for this, having to do with the form of the Mersenne 
numbers: $2^n - 1$.

As an example, let’s show how easy it is to discover that $2^{23} - 1$ is not
a Mersenne prime (and, in fact, n = 23 did not appear on Mersenne’s
original list of values for which $2^n - 1$ is prime). Note that, by Fermat’s
little theorem, $47 \mid 2^{46} - 1$. So $47 \mid (2^{23} - 1)(2^{23} + 1)$. Thus 47 must divide
$2^{23} - 1$ or $2^{23} + 1$. It is now easy to see that $47 \mid 2^{23} - 1$. This can be done
with a calculator, or even by longhand since $2^{23} - 1 = 8\,388\,607$. Or,
better yet, we can use fast exponentiation, modulo 47, since $2^5 = 32$,
and so $2^{11} = 2 \cdot (32)^2 = 2048 \equiv 27$, which means that
$2^{23} = 2 \cdot (27)^2 = 1458 \equiv 1$; hence $47 \mid 2^{23} - 1$.
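The fast exponentiation just carried out by hand can be sketched in Python as follows. This is only an illustration: the function name `power_mod_steps` is our own, and it processes the binary digits of the exponent from left to right so that the intermediate exponents 5, 11, and 23 appear exactly as in the computation above.

```python
def power_mod_steps(base, exponent, modulus):
    """Left-to-right binary exponentiation.

    Returns (result, steps), where steps lists each intermediate
    pair (e, base**e mod modulus) reached along the way.
    """
    result = 1
    reached = 0          # exponent reached so far
    steps = []
    for bit in bin(exponent)[2:]:
        # Squaring doubles the exponent reached so far.
        result = result * result % modulus
        reached *= 2
        if bit == "1":
            # Multiplying by the base adds one to the exponent.
            result = result * base % modulus
            reached += 1
        steps.append((reached, result))
    return result, steps


residue, steps = power_mod_steps(2, 23, 47)
print(steps)    # intermediate powers of 2 modulo 47
print(residue)  # 1 means that 47 divides 2^23 - 1
```

Python's built-in `pow(2, 23, 47)` performs the same modular exponentiation and can be used as a cross-check.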

You may recall from Chapter 5 that this is the way Fermat found a
factor for $2^{37} - 1$. There, by Fermat’s little theorem, we concluded that
a prime factor must be of the form 74k + 1, so it was a simple matter of
trying possible prime divisors: 149, 223, 593, ..., until a divisor, 223,
was found that worked. In general, for an odd prime p, any prime factor
of the Mersenne number $2^p - 1$ must be of the form 2kp + 1.
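The search described above is easy to mechanize. The sketch below is an illustration under our own naming (`smallest_divisor_of_mersenne` and the cutoff `max_k` are hypothetical choices): it runs through the candidates q = 2kp + 1 in increasing order and uses modular exponentiation to test whether q divides $2^p - 1$. Since every prime factor has this form, the first candidate that divides is the smallest prime factor, or $2^p - 1$ itself when that number is prime and the search reaches it.

```python
def smallest_divisor_of_mersenne(p, max_k=10**6):
    """For an odd prime p, return the smallest q = 2kp + 1 (k >= 1)
    dividing 2^p - 1, or None if none is found for k <= max_k."""
    for k in range(1, max_k + 1):
        q = 2 * k * p + 1
        # q divides 2^p - 1 exactly when 2^p leaves remainder 1 modulo q.
        if pow(2, p, q) == 1:
            return q
    return None


for p in (7, 11, 23, 37):
    print(p, smallest_divisor_of_mersenne(p))
```

For p = 7 the search only stops at 127, which is $2^7 - 1$ itself, reflecting the fact that this Mersenne number is prime.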

Gradually, Mersenne’s original list was corrected and his five mistakes
were discovered. Mersenne missed three numbers: with heroic
amounts of calculation, using essentially Fermat’s approach, it was
shown in 1883 that $2^{61} - 1$ is prime, in 1911 that $2^{89} - 1$ is prime, and in
1914 that $2^{107} - 1$ is prime.

Mersenne was also wrong about two of the numbers on his list: 
67 and 257. We already told the famous story of Frank Nelson Cole’s 
factorization of $2^{67} - 1$. In 1930, the American mathematician D. H.
Lehmer proved that $2^{257} - 1$ is not prime, although it would not be until
1979 that a prime factor could actually be found using a computer. So
how could Lehmer prove that $2^{257} - 1$ is composite?

In 1876, Edouard Lucas (who in 1877 would verify that $2^{127} - 1$ is
prime) developed a method for testing Mersenne numbers, a method
that was later improved by Lehmer. This is now called the Lucas-Lehmer
test:

Define a sequence of numbers by

$$S_i = \begin{cases} 4, & \text{if } i = 0, \\ S_{i-1}^2 - 2, & \text{if } i > 0; \end{cases}$$

that is, 4, 14, 194, 37634, 1 416 317 954, .... Then, for an odd
prime p, the Mersenne number $2^p - 1$ is prime if and only if

$$S_{p-2} \equiv 0 \pmod{2^p - 1}.$$

(For a short proof of the Lucas-Lehmer test requiring only a minimal 
amount of group theory, see J. W. Bruce, “A Really Trivial Proof of the 
Lucas-Lehmer Test,” The American Mathematical Monthly 100(4) (April 
1993), 370-71.) 

For example, to test the Mersenne number $2^7 - 1$ (not that there is any
doubt about it being prime), we would reduce this sequence, modulo
$2^7 - 1 = 127$, and get

4, 14, 67, 42, 111, 0, ... ;

that is, $S_5 \equiv 0 \pmod{127}$, which tells us that $2^7 - 1$ is prime.

To test the Mersenne number $2^{11} - 1$ (not that there is any doubt
about it being composite), we would reduce this sequence, modulo
$2^{11} - 1 = 2047$, and get

4, 14, 194, 788, 701, 119, 1877, 240, 282, 1736, ... ;

that is, $S_9 \not\equiv 0 \pmod{2047}$, which tells us that $2^{11} - 1$ is composite.
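The two hand computations above translate directly into a few lines of Python. The following is a minimal sketch of the Lucas-Lehmer test as stated in this section; the function name `lucas_lehmer` and the choice to return the whole list of residues are ours, and no attempt is made at the optimizations a large-scale search would need.

```python
def lucas_lehmer(p):
    """Lucas-Lehmer test for an odd prime p.

    Returns (is_prime, residues), where residues holds S_0, ..., S_(p-2)
    reduced modulo the Mersenne number 2^p - 1.
    """
    mersenne = (1 << p) - 1
    s = 4 % mersenne
    residues = [s]
    for _ in range(p - 2):
        # Reduce at every step so the numbers never grow large.
        s = (s * s - 2) % mersenne
        residues.append(s)
    return s == 0, residues


print(lucas_lehmer(7))
print(lucas_lehmer(11))
print([p for p in (3, 5, 7, 11, 13, 17, 19, 23, 29, 31) if lucas_lehmer(p)[0]])
```

Reducing modulo $2^p - 1$ at each step is what keeps the test practical: only p - 2 squarings of numbers smaller than $2^p - 1$ are required.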

The efficiency of the Lucas-Lehmer test and modern computers have 
made it possible to find ever larger and larger Mersenne primes, and 
thus to find ever larger and larger primes. It is not actually known 
whether the number of Mersenne primes is infinite, but the search for 
the “next” Mersenne prime goes on as if the answer to this important 
open question is an obvious and resounding yes. GIMPS — the Great 
Internet Mersenne Prime Search — was begun in 1996 and uses the 
Lucas-Lehmer test to find Mersenne primes by coordinating the efforts 
of hundreds of individuals around the world, each working with his or 
her own computer. 

On January 25th, 2013, the largest known prime was discovered
when it was verified that the Mersenne number

$$2^{57\,885\,161} - 1$$

is prime. This number now ranks as the forty-eighth Mersenne prime
and has over 17 million digits. The forty-seventh Mersenne prime

$$2^{43\,112\,609} - 1$$

had been found in 2008 and has almost 13 million digits. The team
that discovered this prime was awarded a $50,000 prize for finding 
the first prime with more than 10 million digits. You can follow 
the progress of the search for Mersenne primes and rumors of a 
prize of $150,000 for a prime with more than 100 million digits at 
www.utm.edu/research/primes/mersenne.shtml. 


Problems 

9.1 * (H) A number such as n= 1817 is small enough that it is easy to factor 
simply by trying primes 2, 3, 5, 7, 11, 13, . . . , in order until we find a 
factor. However, this is also a good number with which to practice the 
quadratic sieve method. Factor 1817 using the quadratic sieve method.
Be sure to explicitly use the Euclidean algorithm even if at the end you 
can spot the gcd by inspection. 

9.2 (S) In both examples of the quadratic sieve method we looked at in this
chapter, something very nice occurred during the last step. In the first
example, we found that $194 \cdot 34 \equiv 0 \pmod{1649}$, and in the second,
$12128 \cdot 25842 \equiv 0 \pmod{27667}$. In each case, the gcd of each factor
with n was greater than 1 and, hence, produced a factor of n, which is 
what we were looking for. In this problem, however, you will see that 
we are not always so lucky. 

Use the quadratic sieve method to factor n = 2921. In particular, use 
a mesh size for your sieve of 50 to filter the numbers $55^2 - 2921$
through $66^2 - 2921$. Now, among the numbers that successfully pass
through this filter, find a combination of numbers that collectively 
produces a square. What surprising thing happens next that prevents 
you from being able to use this combination to factor 2921? 

Then (and this is what we always do in the quadratic sieve method
when faced with this unpleasant surprise) simply continue on with
the numbers $55^2 - 2921$, $56^2 - 2921$, ..., as if nothing at all had
happened by computing $67^2 - 2921$, and so on, looking for another
combination that produces a square.

9.3 (S) Use Fermat’s little theorem together with fast exponentiation to test 

the number 9997 for primeness. 

9.4 * (S) Compare how long it takes you to test the number 1 000 283 for 

primeness by the following two methods: 

Method 1 : Divide the number 1 000 283 by each of the primes 


2, 3, 5, 7, 11, . . . 


up to $\sqrt{1\,000\,283}$. In this case, don’t actually do all of the
computations, but instead estimate how long it would take you 
in the following way. 

Divide 1 000 283 by enough primes, say ten to twenty, in order to find 
out how long on average it takes you to test each prime as a potential 
divisor. Then use the prime number theorem to figure out how many 
primes you might need to test, and how long it would take you, to do 
the number 1 000 283. 
Method 2: Use Fermat's little theorem together with fast
exponentiation. In this case, too, don’t actually do all of the 
computations, but instead estimate how long it would take you in the 
following way. 

First, note that in binary we can write 

1 000 283 = 11 110 100 001 101 011 011.

Therefore, we need to compute each of
$2^1, 2^3, 2^7, 2^{15}, 2^{30}, 2^{61}, 2^{122}, 2^{244}, \ldots, 2^{125\,035}, 2^{250\,070}, 2^{500\,141}, 2^{1\,000\,283}$
modulo 1 000 283.

Now, beginning with $2^{15} = 32\,768$, compute the next several powers
of 2 modulo 1 000 283 in order to find out how long on average it takes
you to compute each power of 2. Then figure out how many
computations, and how long, it would take you to reach $2^{1\,000\,283}$.

9.5 (H) Recall Euler’s formula from Problem 7.42 that combines the three
numbers e, i, and $\pi$ in a single formula. A similar formula combines the
three important number-theoretic functions $\sigma$, $\varphi$, and $\tau$ that can be
used as a test for primeness. Unfortunately, like the converse to
Wilson’s theorem, this test is of no practical value. Nonetheless, it is
still a remarkable formula.

Prove that a positive integer n is prime if and only if

$$\sigma(n) + \varphi(n) = n \cdot \tau(n).$$

9.6 (H,S) Recall that a composite positive integer n such that $n \mid (2^n - 2)$ is called
a pseudoprime. For example, 341 is a pseudoprime, as we saw in
Problem 5.9.

Another pseudoprime is $2047 = 2^{11} - 1$. We saw that 2047 is
composite in Problem 5.16 since $2047 = 23 \cdot 89$. We can show that
$2047 \mid (2^{2047} - 2)$ by showing that $2047 \mid (2^{2046} - 1)$. This turns out to be
easy to do because $11 \mid 2046$, and so we can write

$$2^{2046} - 1 = (2^{11} - 1)(2^{2035} + 2^{2024} + 2^{2013} + \cdots + 1).$$

Therefore, $(2^{11} - 1) \mid (2^{2046} - 1)$; that is, $2047 \mid (2^{2046} - 1)$.

(a) For a prime p, if the number $2^p - 1$ is prime, then we call $2^p - 1$ a
Mersenne prime. Prove that if, on the other hand, p is prime and
$2^p - 1$ is composite, then $2^p - 1$ is a pseudoprime (for example,
p = 11 is prime but $2047 = 2^{11} - 1$ is composite).

(b) Prove that there are infinitely many pseudoprimes by showing
that if n is a pseudoprime, then so is $2^n - 1$.

(c) Note that the pseudoprime 341 can be factored as
$341 = 11 \cdot 31 = \left(\frac{2^5 + 1}{3}\right)(2^5 - 1)$. Prove that there are infinitely
many pseudoprimes by showing that if p > 3 is prime, then
$n = \left(\frac{2^p + 1}{3}\right)(2^p - 1)$ is a pseudoprime.

9.7 (H) In 1877, Lucas found a pseudoprime 2701 = 37 • 73. Prove that if 

n = pq, where p is a prime of the form p = 4k + 1 such that q = 2p - 1 
is also prime, then n is a pseudoprime (for example, n = 97 • 193 is also 
a pseudoprime). 

9.8 (H) Recall Fermat’s famous conjecture that all numbers of the form
$2^{2^n} + 1$ (that is, all Fermat numbers) are prime. While it is true that the
first five of these numbers are prime (that is, for n = 0, 1, 2, 3, 4), it
seems quite likely that the rest of these numbers are composite (for
example, it is known that $F_n$ is composite for $5 \le n \le 32$; complete
factorizations are known for these numbers up to $F_{11}$, and at least one
factor is known for the rest except for $F_{20}$ and $F_{24}$, the latter number
having more than five million digits). Prove that any composite Fermat
number is a pseudoprime.

In fact, this result might even be seen as a partial defense on behalf
of Fermat for his misguided conjecture. Since the first pseudoprime was
not discovered until 1819, it is certainly possible that Fermat may well
have been aware of the fact that $F_n \mid (2^{F_n} - 2)$ for any of the numbers
$F_n = 2^{2^n} + 1$, and that he justifiably took this as extremely strong
evidence of his conjecture that Fermat numbers are prime.

9.9 (H) It was not until 1950 that the first even pseudoprime 161 038 was 

discovered by D. H. Lehmer. The following year, the Dutch 
mathematician N.G.W.H. Beeger proved that there are infinitely many 
even pseudoprimes. 

Prove that 161 038 is a pseudoprime by using the prime 
decompositions of the numbers 161 038 and 161 037, 

$$161\,038 = 2 \cdot 73 \cdot 1103, \qquad 161\,037 = 3^2 \cdot 29 \cdot 617,$$

and using the method we used in Problem 5.9 to show that 341 is a 
pseudoprime. 

9.10 (H,S) It was easy to show that 561 is an absolute pseudoprime because the
number n = 561 has the property that n has a prime decomposition
n = pqr into a product of three distinct primes p, q, and r such that
p - 1, q - 1, and r - 1 each divide n - 1. Look back at the proof on page
271 to see that it is in fact this property of the number 561 that makes
that proof work.

(a) Verify that the other six absolute pseudoprimes less than 10 000
have this same property. Here are their prime factorizations:

1105 = 5 · 13 · 17      1729 = 7 · 13 · 19      2465 = 5 · 17 · 29

2821 = 7 · 13 · 31      6601 = 7 · 23 · 41      8911 = 7 · 19 · 67.

(b) It is not at all critical that an absolute pseudoprime having this 
property be a product of three distinct primes. Show that 

62 745 = 3 · 5 · 47 · 89 is an absolute pseudoprime for the same
reason. 

(c) Find the smallest absolute pseudoprime that is a product of four 
distinct primes. 

(d) Find the smallest absolute pseudoprime that is a product of five 
distinct primes. 

(e) The absolute pseudoprime 1729 happens to be an extremely 
famous number, not because it is an absolute pseudoprime, but 
because of a story that involves a London taxicab whose license 
number was 1729. I will tell this memorable story at a more
appropriate time in Chapter 15. 

However, there is a highly suggestive pattern in the prime 
factorization of 1729 : 

1729 = (6 + 1)(12 + 1)(18 + 1). 

Find another absolute pseudoprime with this same form in its prime 
factorization : 


$$n = (6k + 1)(12k + 1)(18k + 1).$$

Then see if you can find any absolute pseudoprimes having a similar 
pattern constructed from four primes. 

9.11 (H,S) (a) Run the probabilistic test for primeness on n = 561 for the value 
a = 2. What do you conclude? 

(b) Run the probabilistic test for primeness on n = 561 for the value
a = 50. What do you conclude?

(c) Verify that if for a large value of n the condition of the

probabilistic test is satisfied for ten randomly chosen values of a, 
then the odds that n is composite are less than one in a million; 
and, similarly, if twenty values of a are used, the odds are less 
than one in a trillion. 

9.12 (H,S) Use induction to prove that $n \mid 2^n + 1$ for n = 3, 9, 27, 81, ....

9.13 You may have noticed that the perfect numbers 6 and 28 are triangular 
numbers. The perfect number 496 is also triangular since $496 = \frac{31 \cdot 32}{2}$.
Use Theorem 9.1 to prove that every even perfect number is a 
triangular number. 

9.14 (H,S) Restate Theorem 9.1 by giving a description of the binary 

representation of an even perfect number. 

9.15 (H,S) Theorems 9.1 and 5.4 tell us that the sequence of Mersenne primes 

$2^2 - 1 = 3$, $2^3 - 1 = 7$, $2^5 - 1 = 31$, $2^7 - 1 = 127$, ... (which may or
may not be infinite) produces the sequence 6, 28, 496, 8128, ..., of
even perfect numbers. When looking at this sequence it is tempting to 
conjecture that every even perfect number ends in either a 6 or an 8, 
and in fact it is even tempting to conjecture that the even perfect 
numbers end alternately in 6 and 8. Prove or disprove these two 
conjectures. 

9.16 (S) Prove that if n is an odd perfect number, then $n = p^{4i+1}m^2$, where p is a
prime of the form 4k + 1 and p does not divide m.

9.17 (H) Prove that for a perfect number, the sum of the reciprocals of its
divisors equals 2. For example, for n = 6, we get

$$\sum_{d \mid 6} \frac{1}{d} = \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{6} = 2.$$

9.18 (H,S) A perfect number by definition is a number such that $\sigma(n) = 2n$. In
the ninth century, Thabit ibn Qurra defined an abundant number to be
a positive integer n for which $\sigma(n) > 2n$ and a deficient number to be
one for which $\sigma(n) < 2n$. There are infinitely many abundant numbers
since 12, 24, 48, 96, ..., are all abundant numbers, and there are also
infinitely many deficient numbers since for any odd prime p the
numbers $p, p^2, p^3, \ldots$, are all deficient numbers.

(a) Show that if $k \ge 2$, then the number $n = 3 \cdot 2^k$ is abundant.

(b) Show that if p is an odd prime and $k \ge 1$, then the number
$n = p^k$ is deficient.

(c) If you look at the table below, it is tempting to conjecture that all 
odd numbers are deficient. You should be very suspicious about 
this conjecture, however, since if it were true, then we would 
know that no odd perfect numbers exist. Show that 945 is not 
deficient. It is the smallest abundant odd number. 


n               1    2    3    4    5    6    7    8     9     10   11     12   13
$\sigma(n)$     1    3    4    7    6    12   8    15    13    18   12     28   14
$\sigma(n)/n$   1    3/2  4/3  7/4  6/5  2    8/7  15/8  13/9  9/5  12/11  7/3  14/13



In 1944, Paul Erdős and Leon Alaoglu came up with a
somewhat more careful way of describing that a number is
highly divisible. They defined a positive integer n to be
superabundant if

$$\frac{\sigma(n)}{n} > \frac{\sigma(k)}{k}$$

for all k < n. The table above shows $\frac{\sigma(n)}{n}$ for the first few values of
n. We see that among these numbers 2, 4, 6, and 12 are
superabundant.

(d) Now, as d ranges over all the positive divisors of a positive
integer n, then $\frac{n}{d}$ also ranges over the positive divisors of n (in the
opposite order). Thus, for the function $\sigma(n)$, we can write

$$\sigma(n) = \sum_{d \mid n} d = \sum_{d \mid n} \frac{n}{d},$$

and so, dividing by n, we get

$$\frac{\sigma(n)}{n} = \sum_{d \mid n} \frac{1}{d}.$$

For example, for n = 4, we get

$$\frac{\sigma(4)}{4} = \sum_{d \mid 4} \frac{1}{d} = \frac{1}{1} + \frac{1}{2} + \frac{1}{4} = \frac{7}{4}.$$

Use this representation of $\frac{\sigma(n)}{n}$ as the sum of the reciprocals of
the divisors of n to prove that there are infinitely many
superabundant numbers.
9.19 (H,S) You might have noticed in the table in Problem 9.18 that $\sigma(12) = 28$
is a perfect number; that is, 6 is a perfect number and so is
$\sigma(\sigma(6)) = 28$. Find all perfect numbers n such that $\sigma(\sigma(n))$ is perfect.

9.20 * (S) We have seen how using the idea that for an odd prime p, any prime
factor of the Mersenne number $2^p - 1$ must be of the form 2kp + 1
greatly reduces the number of potential divisors that must be checked
when testing $2^p - 1$ for primeness. We can further reduce the number
of divisors to be checked as follows.

(a) Use Euler’s criterion (Theorem 8.1) and Theorem 8.3 to prove
that if q is a prime factor of the Mersenne number $2^p - 1$ for an
odd prime p, then $q \equiv \pm 1 \pmod{8}$.

(b) Verify that $2^{19} - 1 = 524\,287$ is a Mersenne prime by first
reducing as far as possible the number of potential prime
divisors that need to be checked.

(c) Verify that $2^{43} - 1 = 8\,796\,093\,022\,207$ is not a Mersenne prime.
How many potential prime divisors do you need to check before
you find one that divides $2^{43} - 1$?

9.21 (H,S) Prove that the Mersenne number $2^{2^n} - 1$ has at least n distinct prime
divisors.

9.22 (H) Prove that if m is an odd positive integer, then $2^m - 1$ is relatively
prime to $2^n + 1$ for any positive integer n.

9.23 (S) Use Mersenne numbers to give an alternate proof of the infinitude of 

the prime numbers by showing that if p is a prime number and q is a 
prime that divides the Mersenne number $2^p - 1$, then q > p.

9.24 ★ Use the Lucas-Lehmer test to verify that $2^{13} - 1$ is a Mersenne prime.

9.25 ★ Use the Lucas-Lehmer test to verify that $2^{23} - 1$ is composite.

9.26 (S) The real advantage of the Lucas-Lehmer test for primeness becomes 

apparent only when testing really large numbers by computer. 
However, we can begin to get a sense of its efficiency by considering the 
relatively small composite number 

$$2^{59} - 1 = 576\,460\,752\,303\,423\,487.$$

(a) Estimate how many steps you would have to do using the
methods of Problem 9.20 before you discovered that $2^{59} - 1$ is
not a Mersenne prime.

(b) Roughly how many steps would you have to do using the
Lucas-Lehmer test to discover the same thing?

9.27 (H) In Chapter 5 we saw how Euler used the idea that any prime divisor p
of the Fermat number $F_n = 2^{2^n} + 1$, for $n \ge 2$, is of the form
$p = 2^{n+1} \cdot k + 1$ for some positive integer k in order to factor the Fermat
number $F_5$ (see Problem 5.24). So, to find a prime divisor of
$F_5 = 2^{32} + 1$ all he had to do was try primes of the form p = 64k + 1
until he found one that divided $F_5$ (which happens at k = 10).

Similarly, in 1880, when Landry factored the Fermat number $F_6$, he
was aware that a prime divisor p of $F_6 = 2^{64} + 1$ must have the form
p = 128k + 1. Since Landry would not have found a prime divisor until
he tried k = 2142, we discussed in Chapter 5 how he might have used a
method very much like the sieve of Eratosthenes (see Problem 8.3) to
avoid a lot of unnecessary trial division.

Note that in both of these cases the value of k is even for a prime
divisor p of the Fermat number. This turns out to always be the case; in
other words, there is never any need to try an odd value of k to see if a
prime $p = 2^{n+1} \cdot k + 1$ divides $F_n = 2^{2^n} + 1$. This very handy fact was
proved in 1879 by Edouard Lucas.

Thus Lucas proved that, for $n \ge 2$, any prime divisor p of the Fermat
number $F_n = 2^{2^n} + 1$ is of the form $p = 2^{n+2} \cdot k + 1$ for some positive
integer k. Thus, for example, all six of the known prime divisors of $F_{12}$
are of the form $p = 2^{14} \cdot k + 1 = 16\,384k + 1$, the smallest being
$p = 2^{14} \cdot 7 + 1 = 114\,689$, which was discovered in 1877 by Lucas
himself.

Prove the Lucas result that, for $n \ge 2$, any prime divisor p of the
Fermat number $F_n = 2^{2^n} + 1$ is of the form $p = 2^{n+2} \cdot k + 1$ for some
positive integer k by filling in the details for the following steps.

(a) Note that p must be an odd prime.

(b) Show that $2^{2^{n+1}} \equiv 1 \pmod{p}$.

(c) Then prove that the order of 2 modulo p is $2^{n+1}$. (Recall from
Chapter 7 that the order of 2 modulo p is the least positive
integer r such that $2^r \equiv 1 \pmod{p}$.)

(d) Use Corollary 1 of Fermat’s little theorem, and Theorem 8.3, to
show that $\left(\frac{2}{p}\right) = 1$.

(e) Finally, use Euler's criterion, Theorem 8.1, as well as Corollary 1 of
Fermat’s little theorem again to conclude that $2^{n+2} \mid p - 1$; hence
$p = 2^{n+2} \cdot k + 1$ as claimed.
In the previous chapter on prime numbers we touched on the question
of how to factor a large number into a product of primes, and also on 
the somewhat easier question of how to determine whether a number is 
prime or composite. In this chapter we turn to the third main topic con- 
cerning prime numbers: How are they distributed among the integers? 
We begin with a simple observation about the unpredictable manner in 
which primes spread themselves among the integers. 


Gaps Both Large and Small 

It is not at all hard to find arbitrarily large gaps in the sequence of 
prime numbers — that is, to find arbitrarily long strings of consecutive 
composite numbers. In fact we can produce n consecutive composite 
numbers by using the factorial function as follows: 

$$(n+1)! + 2,\ (n+1)! + 3,\ (n+1)! + 4,\ (n+1)! + 5,\ \ldots,\ (n+1)! + (n+1).$$

So, for example, for n = 10, we get 11! + 2 = 39 916 802, and we know
that the ten consecutive numbers

39 916 802, 39 916 803, 39 916 804, 39 916 805, ... , 39 916 811

will all be composite because the first number has to be divisible
by 2, the second by 3, the third by 4, and so on, until the last number
is divisible by 11. Or, more ambitiously, if we want a million consecutive
composite numbers, we let n = 1 000 000, and produce

1 000 001! + 2, 1 000 001! + 3, ... , 1 000 001! + 1 000 001.

This certainly supports the idea that, as we expect, the primes do get 
farther apart as we get further out in the sequence of integers. 
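The factorial construction is simple to check by machine. The sketch below (the function name `composite_run` is our own) builds the n consecutive numbers (n + 1)! + 2, ..., (n + 1)! + (n + 1) and confirms, for n = 10, that the j-th of them is divisible by j + 1.

```python
from math import factorial


def composite_run(n):
    """Return the n consecutive integers (n+1)! + 2, ..., (n+1)! + (n+1)."""
    base = factorial(n + 1)
    return [base + j for j in range(2, n + 2)]


run = composite_run(10)
for divisor, number in enumerate(run, start=2):
    # (n+1)! is divisible by every j from 2 to n+1, so (n+1)! + j is too.
    print(number, "is divisible by", divisor, ":", number % divisor == 0)
```

Note that this construction only guarantees a gap of length at least n; the actual gap between the neighboring primes may be longer, and much smaller numbers may already exhibit gaps of the same length.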

On the other hand, and this is really quite surprising, it also seems 
that extremely small gaps between primes continue to mysteriously 
persist as we get further out in the sequence of integers. In fact, other 
than the unique gap between the even prime 2 and the first odd prime 3,
the smallest possible gap, a gap of size 2 (such as the gap between 17
and 19, or 59 and 61, or 1 000 000 000 061 and 1 000 000 000 063),
seems to continue to persist no matter how far out we go in the sequence
of primes.

In spite of the fact that the prime number theorem tells us that the 
primes get more and more rare as we go further and further out in the 
sequence of integers, every once in a while we come across a pair of 
primes that are as close to one another as they can possibly be. Such 
a pair of primes— that is, two primes p and p + 2— are called twin primes. 


The Twin Prime Conjecture 

One of the great joys of number theory as a subject is that it has pro- 
vided many interesting questions that are simple to pose and yet remain 
unanswered after many years, in spite of enormous effort by mathe- 
maticians throughout the world. The twin prime conjecture is among 
the most famous unsolved problems in mathematics and proposes an 
answer to one such question: Are there infinitely many primes p such 
that both p and p + 2 are prime? This conjecture optimistically claims 
that yes indeed the sequence of prime numbers never runs out of such 
twin primes. 

Overwhelming data support the twin prime conjecture. For example, 
to find all 35 pairs of twin primes below 1000, and all 8169 pairs below 
1 000 000, is relatively straightforward, and at this time all twin primes 
below 1 000 000 000 000 000 000 have been found! Here is a pair of 
twin primes with more than a hundred thousand digits each: 

$$65\,516\,468\,355 \cdot 2^{333\,333} - 1 \quad \text{and} \quad 65\,516\,468\,355 \cdot 2^{333\,333} + 1.$$

Yet a proof of the twin prime conjecture is nowhere in sight. 
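The counts of twin primes quoted above can be reproduced with a sieve of Eratosthenes. The following Python sketch is only an illustration; the function name `twin_prime_pairs` is our own, and the approach is practical only for modest limits such as one million.

```python
def twin_prime_pairs(limit):
    """Return all pairs (p, p + 2) of primes with p + 2 <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting at i * i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [(p, p + 2) for p in range(3, limit - 1) if sieve[p] and sieve[p + 2]]


print(len(twin_prime_pairs(1000)))       # pairs below 1000
print(len(twin_prime_pairs(1_000_000)))  # pairs below 1 000 000
print(twin_prime_pairs(100))
```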

On the plus side, however, it was proved in 1966 that there are 
infinitely many pairs p and p + 2 such that p is prime and p + 2 has 
at most two prime factors. Problem 10.4 also provides strong support 
for the twin prime conjecture. 

A result known as Brun's theorem tells us something significant about 
the way in which twin primes thin out as we go further and further 
out in the sequence of integers. This theorem, proved in 1915 by the 
Norwegian mathematician Viggo Brun, says that the series consisting
of the sum of the reciprocals of the twin primes converges; that is, the
series

$$\left(\frac{1}{3} + \frac{1}{5}\right) + \left(\frac{1}{5} + \frac{1}{7}\right) + \left(\frac{1}{11} + \frac{1}{13}\right) + \left(\frac{1}{17} + \frac{1}{19}\right) + \cdots$$

converges. For a proof of Brun’s theorem see W. J. LeVeque, Fundamentals
of Number Theory (New York: Dover, 1996).

It is natural to extend the idea of twin primes and to ask whether 
there are any prime triples, that is, any set of three consecutive odd 
numbers n, n + 2, and n + 4 that are all prime. Obviously, the numbers 
3, 5, and 7 form such a triple. But, since for any three consecutive 
odd numbers one of the three numbers must be divisible by 3, the set 
{3, 5, 7} is the only possible such prime triple (see Problem 10.5). 

Since looking for prime triples of the form p, p + 2, and p + 4 turns 
out to be not very interesting, it has instead become standard to call 
three numbers either of the form p, p + 2, and p + 6 or of the form 
p, p + 4, and p + 6 prime triplets if all three numbers are prime. The 
motive for this definition is that, except for the set {3, 5, 7}, this is as
close as three odd primes can be. Some examples of prime triplets are
{5, 7, 11}, {7, 11, 13}, {17, 19, 23}, and {37, 41, 43}. The most obvious
question about prime triplets is whether there are infinitely many. The
largest known prime triplet was found in 2012 and consists of the three
numbers $81\,505\,264\,551\,807 \cdot 2^{33444} - 1$, $81\,505\,264\,551\,807 \cdot 2^{33444} + 1$,
and $81\,505\,264\,551\,807 \cdot 2^{33444} + 5$, each having 10 082 digits.


The Series $\sum_p \frac{1}{p}$

Brun’s theorem, which says that the series of reciprocals of twin primes 
converges, contains very useful information about the relative density 
of twin primes among the integers. Of course, if the twin prime conjec- 
ture turns out to be false, and there are only finitely many twin primes, 
then Brun’s theorem is completely trivial since this series has only a 
finite number of terms. But if the twin prime conjecture is true (and 
this seems highly likely), then the fact that this series converges tells us 
that twin primes do indeed get relatively sparse as they thin out among 
the sequence of integers. 

Let’s now look at the series consisting of the sum of the reciprocals of
all the primes, that is, the series $\sum_p \frac{1}{p}$ where the sum is taken over all
primes p. In 1737, Euler proved that this series diverges (showing once 
again that there are infinitely many primes!). This immediately tells us 
something very important about the sequence of primes: In spite of the 
obvious fact that they do tend to thin out, they still manage to remain 
relatively dense among the sequence of integers. 

We now present two very different proofs of Euler’s result. The first
is a fairly straightforward attack on the problem. We will show that
the sequence of partial sums is always at least as big as a function
that is obviously growing to infinity. In particular, we will show that
$\sum_{p \le x} \frac{1}{p} > \frac{1}{2}\ln(\ln x)$, where it is clear that the latter function goes to
infinity as x goes to infinity. Hence the partial sums are necessarily
pushed along to infinity.

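Recall from Problem 5.18 that, for a geometric series where |r| < 1,
we have $1 + r + r^2 + r^3 + \cdots = \frac{1}{1 - r}$. So, for a prime p, this becomes

$$1 + \frac{1}{p} + \frac{1}{p^2} + \frac{1}{p^3} + \cdots = \frac{1}{1 - \frac{1}{p}}.$$

Thus

$$\prod_{p \le x} \frac{1}{1 - \frac{1}{p}} = \prod_{p \le x} \left(1 + \frac{1}{p} + \frac{1}{p^2} + \frac{1}{p^3} + \cdots\right).$$
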

Now, the product on the right will produce the reciprocal $\frac{1}{k}$ for any
$k \le x$, so

$$\prod_{p \le x} \frac{1}{1 - \frac{1}{p}} \ge \sum_{k=1}^{\lfloor x \rfloor} \frac{1}{k}.$$



At this point we will use an idea from calculus and think about the
circumscribed rectangles above the function $f(t) = 1/t$ from 1 to $\lfloor x \rfloor + 1$.
Since $\sum_{k=1}^{\lfloor x \rfloor} \frac{1}{k}$ represents the circumscribed rectangles over this interval
and we can use an integral to represent the area under the curve over
this interval, we have

$$\sum_{k=1}^{\lfloor x \rfloor} \frac{1}{k} > \int_1^{\lfloor x \rfloor + 1} \frac{1}{t}\,dt > \ln x.$$

t 


We conclude that $\prod_{p \le x} \frac{1}{1 - \frac{1}{p}} > \ln x$, or, taking reciprocals, that

$$\prod_{p \le x} \left(1 - \frac{1}{p}\right) < \frac{1}{\ln x}.$$



In a moment we will take the logarithm of both sides of this expression
(in order to turn this product into a sum since the “log of a product
is the sum of the logs”) so we will have a $\ln\left(1 - \frac{1}{p}\right)$ term to deal with.
Since our ultimate goal is to find some function (any function) that
goes to infinity and is less than $\sum_{p \le x} \frac{1}{p}$, we can give a little ground here.

So, we note that $p \ge 2$, which means that $-\frac{1}{2} \le -\frac{1}{p} < 0$, and we
can make use of the fact that the function $y = \ln(1 + x)$ lies above
the function $y = 2x$ over this interval (this can be verified several
ways but you should do so with a graphing calculator or computer; see
Problem 10.10). Hence $\ln\left(1 - \frac{1}{p}\right) > -\frac{2}{p}$.

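We are now ready to take the logarithm of both sides of the expression
above involving the product, and we get

$$\ln \prod_{p \le x} \left(1 - \frac{1}{p}\right) < \ln\left(\frac{1}{\ln x}\right),$$

which, using properties of logarithms and inserting our inequality,
becomes

$$\sum_{p \le x} \left(-\frac{2}{p}\right) < \sum_{p \le x} \ln\left(1 - \frac{1}{p}\right) < -\ln(\ln x).$$

Finally, we divide by $-2$ (which reverses the inequality), and we have
found the function we were looking for:

$$\sum_{p \le x} \frac{1}{p} > \frac{1}{2}\ln(\ln x).$$

Since the function $\frac{1}{2}\ln(\ln x)$ goes to infinity as x goes to infinity, the
series $\sum_p \frac{1}{p}$ diverges, and this completes our first proof.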
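
The lower bound can also be checked numerically. The following Python
sketch is an illustration added here, not part of the proof; the helper
names primes_up_to and reciprocal_prime_sum are our own choices. It compares
the partial sums of the reciprocals of the primes with $\frac{1}{2}\ln(\ln x)$
for a few values of x.

```python
import math


def primes_up_to(n):
    """Return the list of all primes <= n (sieve of Eratosthenes)."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]


def reciprocal_prime_sum(x):
    """Partial sum of 1/p over all primes p <= x."""
    return sum(1.0 / p for p in primes_up_to(x))


for x in (10, 100, 10_000, 100_000):
    partial = reciprocal_prime_sum(x)
    bound = 0.5 * math.log(math.log(x))
    print(f"x = {x:>7}: sum of 1/p = {partial:.4f}, (1/2) ln(ln x) = {bound:.4f}")
```

Both columns grow very slowly, which is a useful reminder that this series
diverges at an almost imperceptible pace.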

Our second proof of this fact about the sequence of primes is due to
Paul Erdős, who was perhaps one of the greatest, and certainly one of
the most prolific, mathematicians of modern times, publishing papers
in many areas of mathematics, including number theory, combinatorics,
set theory, probability, and analysis. Erdős was born in Budapest,
Hungary, in 1913; he published his first mathematical paper at the age
of nineteen, and earned his doctorate by the time he was twenty-one.

Throughout his long career he was always traveling, from conference 
to conference, university to university, staying in the homes of his 
many collaborators around the world. All too frequently he would burst 
into the bedroom of his startled host in the middle of the night fully 
expecting to be able to resume a discussion of a mathematical problem 
they had been working on earlier in the evening. Paul Hoffman quotes 
one of these colleagues, Joel Spencer, in his excellent biography of 
Erdős, The Man Who Loved Only Numbers:

Mathematical truth is immutable; it lies outside physical reality
.... This is our belief; this is our core motivating force. Yet our
attempts to describe this belief to our nonmathematical friends
are akin to describing the Almighty to an atheist. Paul embodied
this belief in mathematical truth. His enormous talents and
energies were given entirely to the Temple of Mathematics. He
harbored no doubts about the importance, the absoluteness, of
his quest. To see his faith was to be given faith. The religious
world might better have understood Paul’s special personal
qualities. We knew him as Uncle Paul.

Figure 10.1 Paul Erdős on June 12, 1991, as he prepares to give the Paul Erdős
Honorary Lecture at Trinity College, Cambridge University. Photo by George
Csicsery from his film "N Is a Number: A Portrait of Paul Erdős." © 1993. All
rights reserved. www.zalafilms.com.


Here is Erdős’s proof that the series $\sum_p \frac{1}{p}$ diverges. Since each term
in this series is positive, this series must either converge or diverge to
infinity. So, by way of contradiction, we will assume that the series $\sum_p \frac{1}{p}$
converges to some real number r. As the series gets closer and closer to
r, it must at some stage eventually become greater than $r - \frac{1}{2}$; that is,
there is an integer k such that

$$\sum_{i=1}^{k} \frac{1}{p_i} > r - \frac{1}{2},$$

where $p_1, p_2, p_3, \ldots$ are the primes listed in increasing order. This
means that the sum for the entire rest of the series after k (the part
usually called the tail of the series) must be less than $\frac{1}{2}$; that is,

$$\sum_{i=k+1}^{\infty} \frac{1}{p_i} < \frac{1}{2}.$$

We will call these first k primes $p_1, p_2, \ldots, p_k$ the really important
primes, and the rest of the primes $p_{k+1}, p_{k+2}, \ldots$ the less important
primes.

For a given positive integer M it is clear that the M integers n such
that $1 \le n \le M$ can be partitioned into two disjoint sets: those
integers that are divisible only by really important primes, and those
integers that are divisible by at least one less important prime. We
will reach a contradiction by producing an integer M such that the
total number of elements in these two sets does not equal M, which
is impossible since each number from 1 to M must be in one set
or the other. To achieve this surprising contradiction, we now let
$M = 2^{2k+2}$.

For each prime $p_i$ there are exactly $\lfloor M/p_i \rfloor$ integers from 1 to M that are
multiples of $p_i$. Hence the number of integers from 1 to M divisible by
at least one less important prime is at most

$$\sum_{i=k+1}^{\infty} \left\lfloor \frac{M}{p_i} \right\rfloor \le M \sum_{i=k+1}^{\infty} \frac{1}{p_i} < \frac{M}{2},$$

so the number of integers from 1 to M divisible by at least one less
important prime is less than $\frac{M}{2}$.

Now consider an $n \le M$ that has only really important primes as
its divisors. Write $n = ab^2$ where a is the squarefree part of n (see
Problem 3.21). Since a must be a product of distinct really important
primes, and there are only k of these primes to choose from, there are at
most $2^k$ different possibilities for the squarefree part a of such a number
n. Further, since $b \le \sqrt{n} \le \sqrt{M}$, there are at most $\sqrt{M}$ possibilities for b.
Thus the number of integers from 1 to M that have only really important
primes as their divisors can be no greater than

$$2^k \sqrt{M} = 2^k \sqrt{2^{2k+2}} = 2^k \cdot 2^{k+1} = 2^{2k+1} = \frac{M}{2}$$

(at this point you can see why the particular value for the number M
was chosen in the way it was).
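
The counting bound $2^k\sqrt{M}$ holds for any k, whether or not the series
converges, so it can be illustrated directly. The Python sketch below is
our own addition (the function names first_primes and
count_important_only are hypothetical); it counts, for small k, the
integers from 1 to $M = 2^{2k+2}$ whose only prime divisors are among the
first k primes and compares the count with $M/2$. It does not, of course,
reproduce the contradiction, which depends on the assumed convergence.

```python
def first_primes(k):
    """Return the first k primes by trial division against earlier primes."""
    primes = []
    candidate = 2
    while len(primes) < k:
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def count_important_only(k):
    """Count n in 1..M, with M = 2**(2k+2), divisible only by the first k primes.

    Returns the pair (count, M).
    """
    M = 2 ** (2 * k + 2)
    important = first_primes(k)
    count = 0
    for n in range(1, M + 1):
        remaining = n
        for p in important:
            while remaining % p == 0:
                remaining //= p
        if remaining == 1:  # every prime factor of n was "really important"
            count += 1
    return count, M


for k in range(1, 6):
    count, M = count_important_only(k)
    print(f"k = {k}: M = {M}, count = {count}, M/2 = {M // 2}")
```

In each case the count is far below $M/2$, which shows how much room the
estimate $2^k\sqrt{M}$ leaves.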

This is the contradiction we were seeking. For this integer, $M = 2^{2k+2}$,
all of the numbers from 1 to M have been partitioned into two sets,
one with fewer than $\frac{M}{2}$ elements and one with at most $\frac{M}{2}$ elements! This
completes the second proof, and we again can conclude that the series
$\sum_p \frac{1}{p}$ diverges.

Bertrand's Postulate

One of the most famous results concerning the distribution of primes 
is known as Bertrand’s postulate, though it is in fact a theorem. In 1845, 
Joseph Bertrand conjectured that for any natural number n there is a 
prime greater than n but no larger than 2n. Moreover, he verified this
conjecture for all integers up to 3 000 000. 
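
A verification of this kind is easy to reproduce on a modern computer. The
following Python sketch is our own illustration (Bertrand certainly did
not work this way, and the names primes_up_to and bertrand_holds_up_to are
hypothetical); it checks directly that every n up to a given limit has a
prime p with $n < p \le 2n$.

```python
import bisect
import math


def primes_up_to(n):
    """Return the list of all primes <= n (sieve of Eratosthenes)."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]


def bertrand_holds_up_to(limit):
    """Check that each n in 1..limit has a prime p with n < p <= 2n."""
    primes = primes_up_to(2 * limit)
    for n in range(1, limit + 1):
        i = bisect.bisect_right(primes, n)  # index of the first prime > n
        if i == len(primes) or primes[i] > 2 * n:
            return False
    return True


print(bertrand_holds_up_to(100_000))
```

Raising the limit to 3 000 000 only requires more time and memory; the
logic is unchanged.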

Bertrand’s postulate was first proved in 1852 by the Russian
mathematician Pafnuty Chebyshev. However, the proof we shall present of
this theorem is essentially a proof that Paul Erdős wrote when he was
only nineteen, and that was published in his very first mathematical
paper in 1932. Erdős would eventually write more than fifteen hundred
mathematical papers; this is more than anyone has written since
Euler.


Theorem 10.1 (Bertrand's postulate). For every integer $n \ge 1$, there is a
prime p such that

$$n < p \le 2n.$$


Proof 

Recall that in our first proof of the fact that the series of reciprocals of
the primes diverges we found a function $\frac{1}{2}\ln(\ln x)$ that bounded the
partial sums $\sum_{p \le x} \frac{1}{p}$ from below. We now do the same sort of thing, but
bound partial products from above.

Claim 1. For all real numbers $x \ge 2$, $\prod_{p \le x} p < 4^{x-1}$.

Proof of Claim. It is sufficient to prove the claim only for prime values
of x. This is because if the claim is true for all prime values and if $x > 2$
is a nonprime real number, we can let q be the largest prime less than x.
Then $\prod_{p \le x} p = \prod_{p \le q} p < 4^{q-1} < 4^{x-1}$.

We now use induction to prove the claim for prime values of x. The
first prime value is x = 2 and in this case the claim is correct since 2 < 4.
Next, assume x = 2m + 1 is a prime greater than 2 and that the claim is
true for all prime values less than x. The idea will be to split the product
$\prod_{p \le 2m+1} p$ into two products, $\prod_{p \le m+1} p$ and $\prod_{m+1 < p \le 2m+1} p$.

Since $m + 1 < x = 2m + 1$ we can use induction to handle the first
product by taking the largest prime $q \le m + 1$. Thus

$$\prod_{p \le m+1} p = \prod_{p \le q} p < 4^{q-1} \le 4^{(m+1)-1} = 4^m.$$

The second product takes a bit more effort. First, we observe that

$$\prod_{m+1 < p \le 2m+1} p \le \frac{(2m+1)!}{m!\,(m+1)!} = \binom{2m+1}{m}$$

because all the primes in the interval $m + 1 < p \le 2m + 1$ appear in
the numerator but not in the denominator of this fraction. Next, recall
from Problem 5.37 the property of Pascal’s triangle that the sum of the
binomial coefficients in the nth row is $2^n$; thus

$$\binom{2m+1}{0} + \cdots + \binom{2m+1}{m} + \binom{2m+1}{m+1} + \cdots + \binom{2m+1}{2m+1} = 2^{2m+1}.$$

But, in this case, the two middle terms, $\binom{2m+1}{m}$ and $\binom{2m+1}{m+1}$, are not
only the largest terms in this row of Pascal’s triangle, they are equal!
Therefore, $2\binom{2m+1}{m} < 2^{2m+1}$, and so $\binom{2m+1}{m} < 2^{2m} = 4^m$. We conclude
that our second product

$$\prod_{m+1 < p \le 2m+1} p \le \binom{2m+1}{m} < 4^m.$$

Now, putting these two products back together, we see that, for
x = 2m + 1,

$$\prod_{p \le x} p = \prod_{p \le 2m+1} p < 4^m \cdot 4^m = 4^{2m} = 4^{x-1}.$$

This completes the proof of Claim 1.
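
Claim 1 is easy to test with exact integer arithmetic. The Python sketch
below is an illustration we add here under our own naming
(primorial_bound_holds is hypothetical); it checks the inequality for
every integer x from 2 up to a limit, which suffices because the product
only changes at integer (indeed prime) values of x.

```python
import math


def primes_up_to(n):
    """Return the list of all primes <= n (sieve of Eratosthenes)."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]


def primorial_bound_holds(limit):
    """Check that the product of all primes p <= x is < 4**(x-1) for x = 2..limit."""
    primes = primes_up_to(limit)
    product = 1
    index = 0
    for x in range(2, limit + 1):
        # Fold in any prime that has just come into range.
        while index < len(primes) and primes[index] <= x:
            product *= primes[index]
            index += 1
        if not product < 4 ** (x - 1):
            return False
    return True


print(primorial_bound_holds(1000))
```

Python integers have arbitrary precision, so no rounding is involved in
the comparison.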

At this point we make another claim about products, this time with
the products being greater than the function $4^n$.

Claim 2. For all $n \ge 3$,

$$4^n < (2n)^{1+\sqrt{2n}} \cdot \prod_{\sqrt{2n} < p \le \frac{2}{3}n} p \cdot \prod_{n < p \le 2n} p.$$

Proof of Claim. We prove this claim by finding upper and lower bounds
for the binomial coefficient $\binom{2n}{n}$. For the lower bound we again use a row
of Pascal’s triangle:

$$\binom{2n}{0} + \binom{2n}{1} + \cdots + \binom{2n}{n} + \cdots + \binom{2n}{2n-1} + \binom{2n}{2n} = 2^{2n},$$

where the middle term $\binom{2n}{n}$ is the largest term in this row. Since $\binom{2n}{0}$ and
$\binom{2n}{2n}$ both equal 1, we can rewrite this as

$$\binom{2n}{1} + \cdots + \binom{2n}{n} + \cdots + \binom{2n}{2n-1} + 2 = 2^{2n}.$$

Now there are 2n terms on the left, and, since $\binom{2n}{n}$ is the largest of these
2n terms, we can conclude that $\binom{2n}{n} > \frac{4^n}{2n}$. Thus, we have our lower
bound for the binomial coefficient:

$$\frac{4^n}{2n} < \binom{2n}{n}.$$
Finding the upper bound takes far more ingenuity. (As you watch this
proof unfold keep in mind that Erdős did this when he was nineteen.)
We will build the upper bound by finding all possible prime factors of
$\binom{2n}{n}$. But $\binom{2n}{n} = \frac{(2n)!}{n!\,n!}$ so it is natural to start by thinking about how to
count the number of times a prime p divides n!.

As we saw earlier in this chapter, for a prime p, there are $\lfloor n/p \rfloor$ integers
from 1 to n that are multiples of p, so that gives us $\lfloor n/p \rfloor$ times that p
divides n!. But some of those multiples are also divisible by $p^2$, so we
need to count those as well and get another $\lfloor n/p^2 \rfloor$ times that p divides n!.
Then we count the multiples divisible by $p^3$, and so on. Thus n! contains
a given prime p as a factor exactly $\sum_{k \ge 1} \lfloor n/p^k \rfloor$ times.

Therefore, for a prime p, the number of times that p appears as a
factor of $\binom{2n}{n} = \frac{(2n)!}{n!\,n!}$ is exactly

$$\sum_{k \ge 1} \left( \left\lfloor \frac{2n}{p^k} \right\rfloor - 2\left\lfloor \frac{n}{p^k} \right\rfloor \right).$$

Note that each term in this sum is either 0 or 1 because $\frac{n}{p^k} - \left\lfloor \frac{n}{p^k} \right\rfloor < 1$,
and so $\frac{2n}{p^k} - 2\left\lfloor \frac{n}{p^k} \right\rfloor < 2$; hence $\left\lfloor \frac{2n}{p^k} \right\rfloor - 2\left\lfloor \frac{n}{p^k} \right\rfloor < 2$.
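
The two counting formulas above translate directly into code. The Python
sketch below is our own illustration (the function names
exponent_in_factorial and exponent_in_central_binomial are hypothetical);
it computes how many times a prime p divides n! and how many times it
divides $\binom{2n}{n}$.

```python
import math


def exponent_in_factorial(n, p):
    """Number of times the prime p divides n!: the sum of floor(n / p**k), k >= 1."""
    exponent = 0
    power = p
    while power <= n:
        exponent += n // power
        power *= p
    return exponent


def exponent_in_central_binomial(n, p):
    """Number of times the prime p divides the binomial coefficient C(2n, n)."""
    return exponent_in_factorial(2 * n, p) - 2 * exponent_in_factorial(n, p)


n = 10
for p in (2, 3, 5, 7, 11, 13, 17, 19):
    print(p, exponent_in_central_binomial(n, p))
print(math.comb(2 * n, n))  # 184756 = 2^2 * 11 * 13 * 17 * 19
```

For n = 10 the primes 3, 5, and 7 do not divide $\binom{20}{10}$ at all, and 2
divides it exactly twice, in agreement with the factorization shown in the
final comment.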

We are almost ready to build our upper bound. For a prime p we
have just seen that the largest power $p^k$ of p that divides $\binom{2n}{n}$ cannot
be larger than 2n (each term of the sum is at most 1, and every term with
$p^k > 2n$ is 0). In particular, then, if $p > \sqrt{2n}$, the prime p can divide
$\binom{2n}{n}$ at most once. So we could build a product in two pieces: one piece
would simply be $\prod_{\sqrt{2n} < p \le 2n} p$ since over this interval this accounts for
all the prime divisors of $\binom{2n}{n}$; the other piece would be over the interval
$p \le \sqrt{2n}$ where, since we don’t know exactly the largest power $p^k$ that
divides $\binom{2n}{n}$, we replace that largest power by something slightly larger,
namely, 2n, and so this piece would be $\prod_{p \le \sqrt{2n}} 2n$.

However, what Erdős realized was that over the interval $\frac{2}{3}n < p \le n$
there are no primes p that divide $\binom{2n}{n}$. This is because for a prime p in
this interval, 3p > 2n, and so p appears exactly twice in the numerator
of $\binom{2n}{n} = \frac{(2n)!}{n!\,n!}$ (as p and as 2p) and also exactly twice in the denominator.

Therefore, Erdős built his upper bound for the binomial coefficient
using three pieces for the product:

$$\binom{2n}{n} \le \prod_{p \le \sqrt{2n}} 2n \cdot \prod_{\sqrt{2n} < p \le \frac{2}{3}n} p \cdot \prod_{n < p \le 2n} p.$$

Note that the third piece of his product (and, hence, this proof) finally 
has something to do with Bertrand’s postulate! 

Now, we compare our lower bound with this upper bound and
replace the first product in the upper bound by a gross overestimate
$(2n)^{\sqrt{2n}}$ since we don’t know how many of the numbers less than or
equal to $\sqrt{2n}$ are prime (so, as a worst case, we assume they all are). This
yields

$$4^n < (2n)^{1+\sqrt{2n}} \cdot \prod_{\sqrt{2n} < p \le \frac{2}{3}n} p \cdot \prod_{n < p \le 2n} p,$$

which is Claim 2.

At this point in the proof we can use contradiction. So we assume
Bertrand’s postulate is false and that there is an integer $n \ge 3$ for which
there is not a prime p such that $n < p \le 2n$. For this integer n, the
last product in the inequality of Claim 2 above disappears, and using
Claim 1 we can replace the remaining product by $4^{\frac{2}{3}n - 1}$; and, for
simplicity, we will even replace it by $4^{\frac{2}{3}n}$. Thus there exists an integer $n \ge 3$
such that $4^n < (2n)^{1+\sqrt{2n}} \cdot 4^{\frac{2}{3}n}$; that is, such that

$$4^{n/3} < (2n)^{1+\sqrt{2n}}$$

and for which Bertrand’s postulate is false.
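
It can be instructive to evaluate both sides of this inequality for
particular values of n. The Python sketch below is our own addition (the
name inequality_holds is hypothetical); it compares logarithms of the two
sides rather than the sides themselves, to avoid floating-point overflow
for large n. It is meant only for experimentation and makes no claim about
exactly where the inequality first fails.

```python
import math


def inequality_holds(n):
    """True when 4**(n/3) < (2n)**(1 + sqrt(2n)), compared via natural logarithms."""
    left = (n / 3) * math.log(4)
    right = (1 + math.sqrt(2 * n)) * math.log(2 * n)
    return left < right


for n in (3, 10, 100, 1000, 10_000):
    print(n, inequality_holds(n))
```

The inequality holds at n = 10 but fails at n = 1000, so the transition
lies somewhere in between.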

Now, for a few small values of n (take n = 10, for example) this
inequality can actually hold true (try it). On the other hand, it is clear
that for sufficiently large values of n this inequality will always fail to
hold (because the exponent n on the left dominates the exponent $\sqrt{n}$
on the right as n gets large). Therefore, all that needs to be done to
finish the proof by contradiction is to determine a specific number n
sufficiently large so that the only values for which this inequality can
hold true are values that are less than this number n, and then to check
that Bertrand’s postulate is true for each of these numbers less than n.
This detail is left for you to do in Problem 10.12. This completes Erdős’s
proof of Bertrand’s postulate. ■

Goldbach's Conjecture

The regular correspondence that took place between Euler and Chris- 
tian Goldbach was mentioned twice in Chapter 7, a 1729 letter in 
which Goldbach asked Euler about Fermat primes, and a 1753 letter 
in which Euler discussed Fermat’s last theorem. But no letter in their 
vast correspondence is nearly as famous as the one Goldbach wrote to 
Euler (then living in Berlin) on June 7, 1742.

In this letter he asked Euler (in a margin, no less) whether every 
integer greater than 2 is the sum of three primes. At this time, Goldbach 
considered 1 a prime, so he would write, for example, 4 = 1 + 1 + 2, and 
consider this to be a sum of three primes. Euler wrote back (in the rather 
strange mixture of German and Latin that was typical of their letters 
during this period) reminding Goldbach of earlier communication 
during which they had discussed a variation of this conjecture: every 
even integer is the sum of two primes. Again, Euler would have thought 
of 2 = 1 + 1 as a sum of two primes. In Problem 10.17 you are 
asked to show that these two variations of Goldbach’s question are 
equivalent. 

So, since Goldbach’s original question just boils down to the ques- 
tion of whether even integers greater than 2 are always a sum of two 
primes, we now state Goldbach's conjecture formally as: 


Every even integer greater than 2 can be written as the sum of two 
primes. 


Euler was quite confident that this conjecture is true even though 
he had no idea how to prove it. Even today it is considered one of 
the most difficult problems in mathematics. The publishers Faber and 
Faber offered $1,000,000 between March 20, 2000, and March 20, 
2002, for a proof of Goldbach’s conjecture as a publicity gimmick 
when they launched a novel by mathematician Apostolos Doxiadis, 
Uncle Petros and Goldbach's Conjecture. Needless to say, their money 
was safe, although there is a massive amount of evidence support- 
ing this conjecture. All even numbers have been checked up to 
4 000 000 000 000 000 000. 
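
On a much smaller scale, this kind of checking is easy to carry out. The
following Python sketch is our own illustration, not a description of how
the large-scale verifications were done; the names is_prime and
goldbach_pairs are hypothetical. It lists every way of writing an even
number as a sum of two primes.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


def goldbach_pairs(n):
    """All pairs (p, q) of primes with p <= q and p + q = n."""
    return [(p, n - p) for p in range(2, n // 2 + 1)
            if is_prime(p) and is_prime(n - p)]


print(goldbach_pairs(10))  # [(3, 7), (5, 5)]

# Every even number from 4 to 10 000 has at least one representation.
print(all(goldbach_pairs(n) for n in range(4, 10_001, 2)))
```

Trial division is far too slow for numbers of the size mentioned above,
but it is perfectly adequate for exploring small cases by hand or machine.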

It turns out that, like a good number of other things in mathematics,
Goldbach’s conjecture is not even named for the right person; Descartes
thought of this same question well before Goldbach and Euler did. Paul
Erdős, however, took the position that in the grand scheme of things
this particular case of misnaming was not such a bad thing, for, as he
was fond of saying: “Mathematically speaking, Descartes was infinitely
rich and Goldbach was very poor.”

Arithmetic Progressions

In Problems 5.17 and 6.15 we used two different methods to prove that 
the sequence of odd integers 

1, 5, 9, 13, 17, 21, 25, 29, 33, 37, ...,

that is, integers of the form 1 + 4n, contains infinitely many
prime numbers. So, in this case, we know that there are infinitely
many primes 5, 13, 17, 29, 37, ... within this particular sequence of
numbers.

In 1775, Euler conjectured that more generally it is true that for any 
positive integer d, the sequence 

1, 1 + d, 1 + 2d, 1 + 3d, 1 + 4d, 1 + 5d, ...,

that is, integers of the form 1 + nd, contains infinitely many prime
numbers.

Thus the sequence of integers

1, 7, 13, 19, 25, 31, 37, 43, 49, 55, ...,

that is, integers of the form 1 + 6n, would contain infinitely many
prime numbers. So there would also be infinitely many primes
7, 13, 19, 31, 37, ... contained within this particular sequence of
numbers.

Ten years later, Legendre generalized this idea further and conjectured
that if a and d are positive integers, and relatively prime, then the
sequence

a, a + d, a + 2d, a + 3d, a + 4d, a + 5d, ...,

that is, integers of the form a + nd, contains infinitely many prime
numbers. In Problem 6.15 we have already seen two other instances
where this more general conjecture holds true; namely, for the sequence
of integers of the form 3 + 4n and for the sequence of integers of the form
5 + 6n.

In 1837, Dirichlet proved this conjecture, which is now appropriately
called Dirichlet’s theorem, and which we state without proof. Recall
from Chapter 1 that any sequence a, a + d, a + 2d, a + 3d, ...
(whether it is finite or infinite) is called an arithmetic progression, and
that d represents the common difference between successive terms of the
sequence.

Theorem 10.2 (Dirichlet's theorem). For any two positive, relatively
prime integers a and d, the arithmetic progression

a, a + d, a + 2d, a + 3d, ...

contains infinitely many primes.


It is impossible for an infinite arithmetic progression to consist en- 
tirely of primes; in fact, any infinite arithmetic progression contains in- 
finitely many composite numbers (see Problem 10.22). However, it has 
long been conjectured that there exist arbitrarily long strings of prime 
numbers occurring consecutively within arithmetic progressions. 

A string of primes of length four occurs in the arithmetic progression
1, 7, 13, 19, 25, 31, 37, 43, 49, 55, ..., namely, the four primes
61, 67, 73, 79. This particular progression does not contain any strings
of primes of length greater than four, though it does contain others of 
that length, such as the four primes 601, 607, 613, 619. Nonetheless, 
the conjecture has long been that, whatever length string of primes you 
have in mind, there exists an arithmetic progression having within it a 
string of consecutive primes of that length. 
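
Strings of this kind are easy to search for. The Python sketch below is
our own illustration (the names is_prime and prime_runs are hypothetical);
it scans the first terms of a progression a + nd and collects the maximal
runs of consecutive terms that are prime.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


def prime_runs(a, d, terms):
    """Maximal runs of consecutive primes among a, a + d, ..., a + (terms - 1) d."""
    runs, current = [], []
    for n in range(terms):
        value = a + n * d
        if is_prime(value):
            current.append(value)
        else:
            if current:
                runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs


runs = prime_runs(1, 6, 200)
print([run for run in runs if len(run) == 4])
print(max(len(run) for run in runs))
```

For the progression 1 + 6n the longest runs found have length four, and
both of the strings mentioned above appear among them. A longer run is
impossible here because every fifth term of this progression is a
multiple of 5.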

In 2004, Ben Green of the University of Bristol in England and 
Terence Tao of the University of California, Los Angeles, settled this 
centuries-old conjecture and proved that the prime numbers contain 
arithmetic progressions of any finite length whatsoever! 

Terence Tao was born in 1975 in Australia, and by the time he was
eleven in Adelaide, Terry was already participating in international
mathematics competitions. In 1988, at age thirteen, he became the
youngest gold medal winner ever in the International Mathematical
Olympiad, having previously won the silver medal in 1987 and the
bronze medal in 1986.

In the summer of 2006, at the International Congress of Mathematicians
held in Madrid, Spain, Terence Tao received the highest award
in mathematics, the Fields Medal (the mathematical equivalent of the
Nobel Prize), for his work in many areas of mathematics including the
n-dimensional Kakeya problem, wave maps in general relativity, Horn’s
conjecture, and nonlinear Schrödinger equations, and of course, for
finally resolving (with Ben Green, now at Cambridge University) one of
the most famous conjectures in number theory.

The Green-Tao theorem, as it is now known, on arithmetic progressions
of primes tells us that there exist arbitrarily long arithmetic
progressions of primes, but it does not tell us how to find them. At the
time of writing, the record for the longest arithmetic progression of
primes is twenty-six primes. On April 12, 2010, Benoît Périchon found

43 142 746 595 714 191 + 5 283 234 035 979 900 n,

and on March 16, 2012, James Fry found

3 486 107 472 997 423 + 371 891 575 525 470 n,

each of which, for n = 0, 1, 2, ..., 25, produces twenty-six primes.
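
Both progressions can be checked with a few lines of code. The Python
sketch below is our own illustration; it uses a Miller-Rabin test with a
fixed set of bases, a choice that is known to give correct answers for all
integers below $2^{64}$ and therefore covers every term tested here. The
names is_prime and all_terms_prime are hypothetical.

```python
def is_prime(n):
    """Miller-Rabin test with fixed bases; deterministic for n < 2**64."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in bases:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def all_terms_prime(start, difference, count):
    """True when start + n * difference is prime for n = 0, 1, ..., count - 1."""
    return all(is_prime(start + n * difference) for n in range(count))


print(all_terms_prime(43_142_746_595_714_191, 5_283_234_035_979_900, 26))
print(all_terms_prime(3_486_107_472_997_423, 371_891_575_525_470, 26))
```

Both calls should report success, in agreement with the records quoted
above.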

Paul Erdős was famous for asking interesting questions and he even
offered monetary awards for solutions to problems he found especially
interesting. In talks he always mentioned several of these problems and
indicated the amount each problem is worth as a way of measuring
how difficult he thought the problem is. Usually problems would be
worth $100 or so, or sometimes for really difficult problems even $500
or $1000. But there is one problem dealing with arithmetic progressions
for which Erdős offered $5000:

Let $a_1 < a_2 < a_3 < \cdots$ be a sequence of natural numbers such
that $\sum_i \frac{1}{a_i} = \infty$. Is it true that this sequence of numbers contains
arbitrarily long arithmetic progressions?

Note that since the sum of the reciprocals of the primes diverges, 
the Green-Tao theorem is a special case of this conjecture. By the way, 
the $5000 prize for a solution is still available. But be warned that this 
problem seems exceedingly difficult. For example, it is not even known 
whether such a sequence contains an arithmetic progression with just 
three terms. 


Problems 

10.1 Use the table of primes at the back of the book to find the thirty-five 
pairs of twin primes below 1000. 

10.2 * The most recent years that were twin primes were 1997 and 1999. Find
the next pair of years that will be twin primes.

10.3 Brun’s theorem says that the series of reciprocals of twin primes
converges. The real number to which this series converges, known as
Brun's constant, is very difficult to evaluate but is approximately equal
to 1.90. The reason this number is hard to evaluate accurately is that
the series of reciprocals of twin primes converges extremely slowly.

(a) In this problem you are to graph a curve to illustrate how slowly
this series converges to 1.90. There are fifteen pairs of twin
primes below 200 beginning with the pair {3, 5} and ending
with {197, 199}. Plot points for this curve as follows. Start by
taking 1 as your first x value, and the first partial sum $\frac{1}{3}$ as the y
value. Then take 2 as the next x value, and the partial sum $\frac{1}{3} + \frac{1}{5}$
as the next y value.

In other words, you are to graph this curve by plotting (up to)
thirty points: $(1, \frac{1}{3})$, $(2, \frac{1}{3} + \frac{1}{5})$, $(3, \frac{1}{3} + \frac{1}{5} + \frac{1}{5})$,
$(4, \frac{1}{3} + \frac{1}{5} + \frac{1}{5} + \frac{1}{7})$, $(5, \frac{1}{3} + \frac{1}{5} + \frac{1}{5} + \frac{1}{7} + \frac{1}{11})$, ...,
$(30, \frac{1}{3} + \frac{1}{5} + \cdots + \frac{1}{199})$. Note
that the fraction $\frac{1}{5}$ gets repeated because the prime 5 is a
member of two different sets of twin primes: {3, 5} and {5, 7}.

Also draw the horizontal line y = 1.90. Your curve should
approach this line asymptotically.

As a practical matter it won’t be necessary to plot all 30
points, although this can be done easily using Mathematica or
Maple. You probably need to plot the first dozen or so fairly
carefully in order to see the true shape of this curve, but once
the shape becomes more predictable you can plot fewer points.
Be sure that you compute $\frac{1}{3} + \frac{1}{5} + \cdots + \frac{1}{199}$, however.

To plot these points in Mathematica, use the DiscretePlot
command; you will also want to name a list of the twin primes,
including 5 twice, that you can then refer to in your sum
command.

(b) Repeat part (a), but instead of using the series of reciprocals of 
twin primes, use the series of reciprocals of all odd primes; that 
is, the series $\frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \frac{1}{11} + \cdots$. Since there are 45 odd primes
less than 200 you should, strictly speaking, plot 45 points. But 
since you can guess the general shape of this curve you only 
need to plot a few points to be able to draw it. Once you have 
drawn your curve decide whether you think this infinite series 
converges or diverges. Can you make a good guess at this using 
only these 45 primes? 

10.4 (S) In Problems 8.4-8.6 we saw that $\pi(x)$, the number of primes less than
or equal to x, could be approximated by the logarithmic integral
$\mathrm{Li}(x) = \int_2^x \frac{dt}{\ln t}$. Similarly, the number of pairs of twin primes less than
or equal to x, which we denote by $\pi_2(x)$, can be approximated by the
function $2C_2\,\mathrm{Li}_2(x)$, where

$$\mathrm{Li}_2(x) = \int_2^x \frac{dt}{\ln^2 t}$$

and where $C_2$ is the number $\prod_{p \ge 3} \frac{p(p-2)}{(p-1)^2}$ (the product is taken over all
odd primes p) known as the twin prime constant, and it is
approximately equal to .66016. The conjecture that the function
$2C_2\,\mathrm{Li}_2(x)$ is a good approximation to $\pi_2(x)$ is due to G. H. Hardy and
John Littlewood and is usually called the Hardy-Littlewood conjecture,
though confusingly it is also sometimes referred to as the twin prime
conjecture.

(a) Recall that $\pi_2(1000) = 35$. Compute $2C_2\,\mathrm{Li}_2(1000)$. Then, as you
did in Problem 8.5, find the ratio of $\pi_2(1000)$ and $2C_2\,\mathrm{Li}_2(1000)$.

(b) Recall that $\pi_2(1\,000\,000) = 8169$. Compute $2C_2\,\mathrm{Li}_2(1\,000\,000)$
and find the ratio of $\pi_2(1\,000\,000)$ to $2C_2\,\mathrm{Li}_2(1\,000\,000)$.

(c) There are 3 424 506 pairs of twin primes less than
1 000 000 000. Compute $2C_2\,\mathrm{Li}_2(1\,000\,000\,000)$ and find the
ratio of $\pi_2(1\,000\,000\,000)$ to $2C_2\,\mathrm{Li}_2(1\,000\,000\,000)$.

The data that have now been collected showing how closely
$2C_2\,\mathrm{Li}_2(x)$ approximates $\pi_2(x)$ for extremely large values of x provide
very strong support for the conjecture that there are infinitely many
twin primes.

10.5 * Prove that for any three consecutive odd numbers one of them must
be divisible by 3. Conclude that if for some n the three numbers n,
n + 2, and n + 4 are all prime, then n = 3.

10.6 (H) Find all prime triplets less than 1000.

10.7 (S) After twin primes and prime triplets come prime quadruplets, prime
quintuplets, and so on, each with their own conjectures. Give a
definition for a set of four numbers to be a prime quadruplet, and give
several examples.

10.8 (H) (a) Since the series $\sum_p \frac{1}{p}$ diverges we know that the sequence of
partial sums

$$\frac{1}{2},\quad \frac{1}{2} + \frac{1}{3},\quad \frac{1}{2} + \frac{1}{3} + \frac{1}{5},\quad \frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7},\quad \frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \frac{1}{11},\quad \ldots$$

marches off to infinity. Prove that, somewhat amazingly, these
partial sums manage to miss every single integer during their
march toward infinity. In other words, prove that for each $k \ge 1$
if $p_1, p_2, \ldots, p_k$ are the first k primes, then $\sum_{i=1}^{k} \frac{1}{p_i}$ is not an
integer.

(b) Prove that the same thing happens for the harmonic series
$\sum_{n=1}^{\infty} \frac{1}{n}$ as it diverges to infinity. In other words, prove that for
any integer n > 1, the partial sum $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n}$ is not
an integer. This remarkable result was proved in 1915 by
L. Theisinger.

10.9 You have perhaps seen the standard clever proof that the harmonic
series $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges. The idea is simply to compare it to a “smaller”
series that obviously diverges. Here is the harmonic series followed by
a series that is “obviously” both smaller and diverging to infinity:

$$1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) + \left(\frac{1}{9} + \cdots + \frac{1}{16}\right) + \cdots$$

$$1 + \frac{1}{2} + \left(\frac{1}{4} + \frac{1}{4}\right) + \left(\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}\right) + \left(\frac{1}{16} + \cdots + \frac{1}{16}\right) + \cdots$$

In Problem 10.8(b) you considered the partial sums

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n}$$

of the harmonic series. These partial sums are called the harmonic
numbers and are denoted by $H_n$; that is, $H_n = \sum_{k=1}^{n} \frac{1}{k}$.

(a) By considering the area under the curve $y = \frac{1}{x}$ from 1 to n, use
circumscribed rectangles to show that

$$H_n > \ln n + \frac{1}{n}.$$

Conclude that the harmonic series diverges.

(b) By considering this same area, but now using inscribed
rectangles, show that

$$H_n < \ln n + 1.$$

(c) Use parts (a) and (b) to conclude that $H_n$ and $\ln n$ are
asymptotically equivalent; that is, show that $\lim_{n \to \infty} \frac{H_n}{\ln n} = 1$.

Thus, for large n, $\ln n$ is a good approximation for $H_n$.

10.10 Graph the two functions $y = \ln(1 + x)$ and $y = 2x$ and verify that
$\ln(1 + x)$ lies above $2x$ over the interval $-\frac{1}{2} \le x < 0$. This fact was used
in the first proof that the series $\sum_p \frac{1}{p}$ diverges. Explain why the
function $y = 3x$ could have been used in place of $y = 2x$. What effect
would this have had on the proof?

10.11 * (H) You may have been impressed that Bertrand had been able to verify
his conjecture for all integers up to 3 000 000. But this is much easier
than you might think.

(a) Explain why the sequence of primes

2, 3, 5, 7, 13, 23, 43, 83, 163, 317

can be used to verify that Bertrand’s postulate is true for all 
integers up to 300. 

(b) Find a similar sequence of primes that can be used to verify 
Bertrand’s postulate for all integers up to 5000. 

10.12 (S) Finish the proof of Bertrand’s postulate by finding a sufficiently large
value of $n \ge 3$ such that the inequality

$$4^{n/3} < (2n)^{1+\sqrt{2n}}$$

is false, and by then verifying Bertrand’s postulate for all integers up
to and including this integer.

10.13 (H,S) Prove the following variation on Bertrand’s postulate: for any
integer n > 2, there is a prime p such that p < n < 2p.

10.14 (H) Bertrand’s postulate has immediate corollaries. Let $p_1, p_2, p_3, \ldots$
be the prime numbers listed in increasing order.

(a) Prove that $p_{n+1} < 2p_n$ for each $n \ge 1$.

(b) Prove that $p_n < 2^n$ for each $n \ge 2$.

10.15 (H,S) Use Bertrand’s postulate to prove that n! is not a square for any
n > 1.

10.16 Goldbach’s conjecture does not claim that the representation of even 
integers as a sum of two primes is unique. Some even numbers, for 
example, 4, 6, and 8, do have unique representations; but it is far more 
typical for even numbers to have multiple representations as sums of 
two primes, such as 10 = 3 + 7 = 5 + 5.

Find an even number greater than 8 that has a unique
representation as a sum of two primes. How many ways can you write
100 as a sum of two primes?

10.17 (H,S) These days, in order to avoid the awkward detail that Goldbach
thought of 1 as a prime number, we now rewrite the conjecture he first
posed to Euler in the margin of his letter of June 7, 1742, as follows:

Every integer greater than 5 can be written as the sum of three 
primes. 

Prove that this statement is equivalent to the Goldbach conjecture. 

10.18 Euler’s conjecture that arithmetic progressions of the form

1, 1 + d, 1 + 2d, 1 + 3d, 1 + 4d, 1 + 5d, ...

contain infinitely many primes is a special case of Dirichlet’s
theorem.

(a) Find the next six primes after 7, 13, 19, 31, 37, and 43 in the
following sequence:

1, 7, 13, 19, 25, 31, 37, 43, 49, 55, ...

(b) Find ten primes in the following arithmetic progression:

1, 11, 21, 31, 41, 51, 61, 71, 81, 91, ...

that is, integers of the form 1 + 10n.

10.19 (a) Find ten primes in the following arithmetic progression:

5, 13, 21, 29, 37, 45, 53, 61, 69, 77, ...

that is, integers of the form 5 + 8n.

(b) Find ten primes in the following arithmetic progression:

7, 17, 27, 37, 47, 57, 67, 77, 87, 97, ...

that is, integers of the form 7 + 10n.

10.20 (H) Does the arithmetic progression 7, 98, 189, ... (that is, integers
of the form 7 + 91n) contain infinitely many primes?

10.21 Use Dirichlet’s theorem on an appropriate arithmetic progression of
the form a + nd to prove that there are infinitely many primes that are
not part of a twin prime pair; that is, there are infinitely many primes
p such that neither p + 2 nor p - 2 is prime.

10.22 (H) Note that in each of the four arithmetic progressions in
Problems 10.18 and 10.19, when a prime p occurs in a progression, it
divides the pth term that comes after it in the sequence. So, for
example, in these four progressions, taken in order, we have
7 | 49, 11 | 121, 5 | 45, and 7 | 77.

Prove that this is always true. In other words, prove that if a term 
a + nd in an arithmetic progression is a prime number p, then that 
prime p will divide the term that appears p terms later in the 
progression. 

Conclude that no infinite arithmetic progression can consist 
entirely of prime numbers. Then argue that, in fact, any infinite 
arithmetic progression contains infinitely many composite numbers. 

10.23 Using a calculator that can factor large numbers, a computer, or Sage, 
verify that the arithmetic progression 

3 486 107 472 997 423 + 371 891 575 525 470 n 

produces twenty-six consecutive primes for the values 

n = 0, 1, 2, ..., 25. Also, while you are at it, you might as well
confirm that this progression produces a composite number for
n = 26.

10.24 * (H) Some of the most famous open problems in mathematics deal with
prime numbers: Goldbach’s conjecture, the twin prime conjecture,
the Riemann hypothesis. However, many other interesting questions 
concerning primes have been posed over the years, and most of these 
remain unanswered. Here are just a few. 

(a) It has been conjectured that there are infinitely many primes of
the form n! + 1. Try each integer n from 0 to 10 and see how
many produce primes of this form. The largest known prime of
this form is 26 951! + 1, found in 2002.

It has also been conjectured that there are infinitely many
primes of the form n! - 1. Try each integer n from 0 to 10 for
this conjecture to see how many produce primes. The largest
known prime of this form is 34 790! - 1, also found in
2002.

(b) In 1922, G. H. Hardy and John Littlewood conjectured that
there are infinitely many primes of the form n^2 + 1.
For example, the primes 2, 5, 17, 37, and 101 are of this form.
Find the next five primes of this form.

As seen in part (a), conjectures often come in pairs. Why isn’t
there a conjecture that there are infinitely many primes of the
form n^2 - 1?

On the other hand, it has been conjectured that there are
infinitely many primes of the form n^2 + n + 1. Find the first ten
primes of this form.

(c) Bertrand’s postulate tells us that for every integer n > 1 we can
always find a prime in the interval between n and 2n.
Surprisingly, it is not known whether for n > 1 there is always a
prime in the interval between n^2 and (n + 1)^2; that is, in the
interval between consecutive squares.

Find a prime in each of the following intervals: between 7^2
and 8^2, between 10^2 and 11^2, and between 25^2 and 26^2.

(d) In 1848, Polignac conjectured that every odd integer could be 
written as the sum of a prime and a power of 2; in other words, 
if n > 0 is an odd integer, then there is a prime p such that 

n = p + 2^k for some k > 0.

Express the number 125 in this form. Then show that 509 cannot 
be written in this manner. 


Sophie Germain 


One mathematician whose name will forever be linked with prime 
numbers is Sophie Germain. She was born in Paris on April 1, 1776, to 
a very well-to-do family — her father was a wealthy silk merchant and in 
1789 was elected as a deputy to the Constituent Assembly representing 
the Third Estate, which was soon to become the National Assembly and 
which, after the storming of the Bastille, began drafting a new French 
constitution. 

Germain became attracted to mathematics as a very young girl read- 
ing books on mathematics in the security of her father’s library while 
the French Revolution raged in the streets of Paris outside their home. 
Nonetheless, her parents strongly disapproved of their daughter’s un- 
seemly interest in mathematics, and she was forced to study secretly by a 
dim light in her bedroom far into the night. Remarkably, by the time the 
Reign of Terror (1793-94) had passed, Germain had been able — entirely 
on her own — to master most of the mathematics of her time, and was 
eager to embark on a path that would allow her to pursue her lifelong 
passion for mathematics. 


Monsieur LeBlanc 

Regrettably, in Germain’s time, educational opportunities were not
open to women. When she was eighteen, the prestigious École
Polytechnique opened in Paris, but only male students could attend.
However, Germain managed to obtain the public lecture notes, and by
pretending to be a male student named M. LeBlanc (that is, Monsieur
LeBlanc) she was able to send Lagrange at the École her own observations
on his published notes on analysis, earning his praise in return.

Lagrange soon learned of Germain’s true identity; nonetheless, 
Lagrange went on to become one of her strongest supporters and a 
genuine mentor. Further admiration for her abilities would soon also 
come from Germany. Germain read Gauss’s Disquisitiones Arithmeticae, 
which he published in 1801, and in 1804 she began corresponding 
with Gauss about her work on Fermat’s last theorem, still using the 
pseudonym M. LeBlanc. In 1807, having by chance discovered her 
deception, Gauss wrote a letter to Germain full of praise for her:

How can I describe to you my admiration and astonishment at 
seeing my esteemed correspondent M. LeBlanc metamorphosed 
into this celebrated person who gives such a brilliant example of 
what I would find it difficult to believe. A taste for the abstract 
sciences in general and above all the mysteries of numbers is very 
rare: this is not surprising, the enchanting charms of this sublime 
science reveal themselves only to those who have the courage to 
go deeply into them. But when a person of her sex which, 
according to our customs and prejudices, encounters infinitely 
more difficulties than men in familiarizing herself with these 
thorny researches, succeeds nevertheless in overcoming these 
obstacles and penetrates the most obscure parts of them, then 
without doubt she must have the noblest courage, quite 
extraordinary talents, and superior genius. 

This famous passage from Gauss’s letter to Germain is utterly stunning. 
Here we have Gauss, a man not usually inclined to exhibit such a 
generous nature — yet who stands in history as perhaps the greatest of 
all mathematicians — and he is not only paying tribute to Germain’s 
mathematical ability, but he also seems to be genuinely aware of the 
tremendous challenges that women have faced throughout history in 
the academic world. 

As the first woman who did truly important original mathematics, 
Sophie Germain’s own place in the history of mathematics is secure. 
Although during her lifetime she was primarily known for her work 
on elasticity in the area of mathematical physics — she won a special 
prize competition from the Paris Academy of Sciences in 1816 for her 
mathematical explanation of the vibration of elastic membranes — she 
also made significant contributions in number theory as a result of her 
lifelong effort to prove Fermat’s last theorem, and it is now for her work 
in number theory that she is most famous. We will discuss her approach 
to Fermat’s last theorem in some detail. In fact, a question somewhat 
related to this theorem arose in Gauss’s letter of 1807. 

In this letter Gauss adds a remark on a point made by Germain in her 
previous letter to him saying that it seems she has stated a “converse 
proposition ... a bit too generally.” He then offers an example “where 
the rule fails.” The converse proposition to which Gauss is referring can 
be stated as: 

If a^n + b^n is of the form x^2 + ny^2, then a + b is also of that same
form.

Note that the exponent n in the first expression is the same n appearing
as the coefficient in the second expression. It is clear in his letter that
Gauss is assuming that n is prime. As an example of what must have
been Germain’s original proposition we consider the coefficient n = 3:
since we can write 2 + 5 = 7 = 2^2 + 3 · 1^2, then we should also be able to
express 2^3 + 5^3 = 8 + 125 = 133 in this same form; and, indeed we can,
since 133 = 5^2 + 3 · 6^2. Her converse proposition would reverse this and
say that since 2^3 + 5^3 can be written in the form x^2 + 3y^2, then 2 + 5 can
also be written in that same form.

Here is Gauss’s astonishing counterexample showing that Germain’s
converse “rule fails”:

15^11 + 8^11 = (1 595 826)^2 + 11 · (745 391)^2,

but 15 + 8 = 23 cannot be written in the form x^2 + 11y^2. A fascinating
detective story that reconstructs how Gauss must have gone
about finding such an incredibly large counterexample is told in
W. C. Waterhouse, “A Counterexample for Germain,” The American
Mathematical Monthly 101(2) (February 1994), 140-50.
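
Both halves of Gauss's counterexample are easy to check by machine. The following is only a minimal sketch in plain Python (an assumption; Sage would work just as well), and the helper name representations is hypothetical rather than anything from Gauss or Waterhouse. It confirms the displayed identity and then searches exhaustively for ways to write a number in the form x^2 + ny^2.

```python
from math import isqrt

def representations(m, n):
    """Return all pairs (x, y) of nonnegative integers with x^2 + n*y^2 == m."""
    found = []
    y = 0
    while n * y * y <= m:
        rest = m - n * y * y      # what x^2 would have to equal
        x = isqrt(rest)
        if x * x == rest:
            found.append((x, y))
        y += 1
    return found

# Gauss's identity: 15^11 + 8^11 = 1595826^2 + 11 * 745391^2
lhs = 15**11 + 8**11
rhs = 1595826**2 + 11 * 745391**2
print(lhs == rhs)                 # True

# 15 + 8 = 23 has no representation of the form x^2 + 11*y^2
print(representations(23, 11))    # []

# The n = 3 illustration from the text: 7 = 2^2 + 3 * 1^2
print(representations(7, 3))      # [(2, 1)]
```

The search is a brute-force loop over y, which is perfectly adequate for numbers of this size.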


Germain Primes 

Before we discuss Germain's work in number theory in any detail, let’s 
first look at the numbers arising from this work that are now named 
for her. The prime numbers we call Germain primes were central to her 
work on Fermat’s last theorem, and these numbers are now well enough 
known to the general public that they even occasionally become a topic 
of casual conversation, or at least they do in a Tony Award- and Pulitzer 
Prize-winning play. 

The central character in the play Proof, by David Auburn, is Catherine 
(played originally on Broadway by Mary-Louise Parker and later in 
the movie version by Gwyneth Paltrow), a young woman who has 
inherited the mathematical talent of her brilliant father. In one of 
the play's funnier scenes Catherine is teasing Hal, a slightly drunk 
graduate student, who has just made the mistake of claiming all “really 
original work” is done by young “guys,” so Catherine reminds him of 
Sophie Germain. Things only go downhill at this point for Hal when 
he then claims that he has probably seen Sophie at math meetings. 
When Catherine mentions to him that Sophie Germain was born in 
Paris in 1776, Hal has little choice but to admit he never met Sophie; 
eventually Hal manages to save face somewhat by at least remembering 
that Germain primes are famous: “Double them and add one, and 
you get another prime. Like two. Two is prime, doubled plus one is 
five: also prime.” Catherine, however, can’t seem to resist completely
upstaging Hal by responding: “Right. Or 92 305 · 2^{16 998} + 1.”

A Germain prime, then, is a prime p such that 2p + 1 is also prime. So,
as Hal says, 2 is a Germain prime because 2 · 2 + 1 = 5 is also prime.
Other examples of Germain primes are 3, 5, 11, and 23 (because they
are prime and because, respectively, 7, 11, 23, and 47 are also prime).
We’ll just take Catherine’s word for it that 92 305 · 2^{16 998} + 1 is a
Germain prime (see Problem 11.7). It is not known whether there are
infinitely many Germain primes. The largest known at this time, found in
2012, is

18 543 637 900 515 · 2^{666 667} - 1.
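
The definition is simple enough to explore directly. Here is a small Python sketch, offered only as an illustration; the function names is_prime and germain_primes are our own, and the trial-division test is suitable for small numbers only, certainly not for numbers like Catherine's.

```python
def is_prime(n):
    """Trial-division primality test; adequate for small n only."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def germain_primes(limit):
    """List the primes p < limit for which 2p + 1 is also prime."""
    return [p for p in range(2, limit) if is_prime(p) and is_prime(2 * p + 1)]

print(germain_primes(100))
# [2, 3, 5, 11, 23, 29, 41, 53, 83, 89]
```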

It is not hard to see why Germain primes might be interesting numbers
to study. Here is just one simple example. Let p > 2 be a Germain
prime. Then, since 2p + 1 is prime, we get by Theorem 5.2 (Fermat’s little
theorem) that

(2^p - 1)(2^p + 1) = 2^{2p} - 1 ≡ 0 (mod 2p + 1).

Thus we can conclude that 2p + 1 divides 2^p - 1 or it divides 2^p + 1. In
fact, we can even tell which of the two numbers 2p + 1 divides.

If p is of the form p = 4k + 3, then the prime 2p + 1 is of the form
8k + 7, and so, by Theorem 8.3, 2 is a quadratic residue modulo 2p + 1;
that is, there is an integer a such that a^2 ≡ 2 (mod 2p + 1). Hence, we
can write

2^p - 1 ≡ (a^2)^p - 1 = a^{2p} - 1 ≡ 0 (mod 2p + 1),

again by Fermat’s little theorem, and we see that in fact 2p + 1 | 2^p - 1.
Now, the reason this is significant is that for such a Germain prime p
with p > 3, the number 2p + 1 is a proper divisor of 2^p - 1, and so
the Mersenne number 2^p - 1 is not a Mersenne prime
(see Problem 11.6).

Conversely, if 2p + 1 | 2^p - 1, we have 2^p - 1 ≡ 0 (mod 2p + 1), and we
can evaluate the Legendre symbol (2/(2p + 1)) as follows, using Theorem 8.1
(Euler's criterion),

(2/(2p + 1)) ≡ 2^{((2p + 1) - 1)/2} = 2^p ≡ 1 (mod 2p + 1).

Thus 2 is a quadratic residue modulo 2p + 1 and, again by Theorem 8.3,
the prime 2p + 1 is of the form 8k + 7 (note that 2p + 1 cannot ever have
the form 8k + 1); hence we conclude that p is of the form 4k + 3.

This result is interesting enough that it is worth recording as a
theorem.

Theorem 11.1. Let p > 2 be a Germain prime. Then

2p + 1 | 2^p - 1, for p = 4k + 3; and 2p + 1 | 2^p + 1, for p = 4k + 1.
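
A quick numerical check of Theorem 11.1 can make the statement more concrete. The sketch below is an illustration only, under the assumption that trial division is fast enough for the small primes involved; the names is_prime and check_theorem_11_1 are hypothetical. It uses Python's built-in three-argument pow to compute 2^p modulo 2p + 1.

```python
def is_prime(n):
    """Trial-division primality test; adequate for small n only."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def check_theorem_11_1(limit):
    """Return (p, sign) for each Germain prime 2 < p < limit, where sign is
    -1 if 2p + 1 divides 2^p - 1 and +1 if 2p + 1 divides 2^p + 1."""
    results = []
    for p in range(3, limit):
        if not (is_prime(p) and is_prime(2 * p + 1)):
            continue
        q = 2 * p + 1
        r = pow(2, p, q)              # 2^p mod q by fast exponentiation
        assert r in (1, q - 1)        # 2^p is congruent to +1 or -1 modulo q
        sign = -1 if r == 1 else 1    # r == 1 means q divides 2^p - 1
        # The theorem predicts sign -1 exactly when p = 4k + 3.
        assert (sign == -1) == (p % 4 == 3)
        results.append((p, sign))
    return results

print(check_theorem_11_1(60))
# [(3, -1), (5, 1), (11, -1), (23, -1), (29, 1), (41, 1), (53, 1)]
```

For instance, the entry (11, -1) records that 23 divides 2^11 - 1 = 2047 = 23 · 89.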

At the time when Germain first became interested in Fermat’s last 
theorem, his “theorem” had been proven for only two values of the 
exponent: for n = 4 by Fermat himself, and for n = 3 by Euler, whose 
slightly flawed proof was eventually completed by Legendre. Instead of 
trying to prove Fermat’s last theorem for a single exponent as others had 
done, Germain boldly made a frontal assault on the problem, and was 
the first person to attempt to deal in a more general way with infinitely 
many exponents. Germain had a plan, called her “grand plan,” for 
proving Fermat’s last theorem. The following theorem offers us a first 
glimpse of Germain’s grand plan. 

Theorem 11.2. Let p be a Germain prime. If x^p + y^p = z^p where x, y, and z
are positive integers, then the prime 2p + 1 divides one of x, y, or z.

Proof

Note that if x is relatively prime to 2p + 1, then (x^p)^2 = x^{2p} ≡ 1
(mod 2p + 1), by Fermat’s little theorem. Thus, since 2p + 1 is prime
we conclude that x^p ≡ ±1 (mod 2p + 1). Similarly, if y and z are also
relatively prime to 2p + 1, then y^p ≡ ±1 (mod 2p + 1) and z^p ≡ ±1
(mod 2p + 1). But this leads to a contradiction since x^p + y^p = z^p and
yet it is impossible to have x^p + y^p ≡ z^p (mod 2p + 1) when the only
choices for each of the three terms are +1 and -1: the left side would
be -2, 0, or 2, and none of these is congruent to ±1 modulo a prime
2p + 1 ≥ 5. Therefore, one of x, y, or z is divisible by 2p + 1.

This completes the proof. ■

Note that the case p = 2 of this theorem, which says that in any
Pythagorean triangle there is always one side that is divisible by 5, was
proved earlier in Problem 3.14.
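
The case p = 2 is easy to watch in action. The following Python sketch is purely illustrative (the function name pythagorean_triples and the search bound 30 are our own choices): it lists the Pythagorean triples with hypotenuse at most 30 and confirms that each has a side divisible by 5, in line with the fact that a nonzero square is congruent to ±1 modulo 5.

```python
def pythagorean_triples(max_z):
    """Yield all (x, y, z) with x <= y < z <= max_z and x^2 + y^2 = z^2."""
    for z in range(1, max_z + 1):
        for x in range(1, z):
            for y in range(x, z):
                if x * x + y * y == z * z:
                    yield (x, y, z)

# Nonzero squares modulo 5 are 1 and 4, that is, +1 and -1.
print(sorted({x * x % 5 for x in range(1, 5)}))   # [1, 4]

triples = list(pythagorean_triples(30))
print(triples[:3])   # [(3, 4, 5), (6, 8, 10), (5, 12, 13)]
print(all(x % 5 == 0 or y % 5 == 0 or z % 5 == 0 for x, y, z in triples))  # True
```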

Let's look at what this very simple theorem tells us. All it says,
in effect, is that if for some Germain prime p > 2 you have a
counterexample x^p + y^p = z^p to Fermat’s last theorem, then there
is a prime of the form q = 2p + 1 that divides one of x, y, or z.
Since obviously x, y, and z have prime divisors anyway, how can the
information that one of them has a prime divisor of a special form
possibly help us prove Fermat’s last theorem? Well, Germain had a plan.


Germain's Grand Plan

Here is the idea behind Germain’s ambitious “grand plan” to prove
Fermat’s last theorem. She believed that for an odd prime p that
provided a counterexample to Fermat’s last theorem it would not be only a
single prime q that divides one of x, y, or z, as in Theorem 11.2, but that
“the march of calculation indicates that there must be infinitely many
(la marche du calcul indique qu’il doit s’en trouver une infinité).”

She cites the case of p = 5 (that is, the case x^5 + y^5 = z^5 of Fermat’s
last theorem), where the following primes would necessarily divide one
of x, y, or z:


2 · 5 + 1 = 11, 2 · 4 · 5 + 1 = 41, 2 · 7 · 5 + 1 = 71, 2 · 10 · 5 + 1 = 101, etc.


But, since it is impossible for infinitely many primes to divide just three 
numbers, this would mean no counterexample could exist for p = 5. 

Thus Germain’s plan was to provide a method that for each odd
prime p would produce infinitely many primes of the form q = 2np + 1
such that the primes q would necessarily divide one of x, y, or z for any
given counterexample x^p + y^p = z^p to Fermat’s last theorem.

The key in Theorem 11.2 was that for any power x^p relatively prime to
2p + 1 the power x^p was congruent to either +1 or -1 modulo 2p + 1. The
key in Germain’s general method is the notion of consecutive nonzero
powers modulo 2np + 1. Let’s look carefully at an example.

Consider p = 5 and assume that x^5 + y^5 = z^5 is a counterexample to
Fermat’s last theorem where x, y, and z are positive integers. Further, we
assume that x, y, and z are relatively prime (otherwise we divide through
by their greatest common divisor). What happens if a prime q, say 11,
does not divide any of x, y, or z? Then, since x is relatively prime to 11,
we know by Theorem 6.1 that x has an inverse modulo 11. Let a be that
inverse; that is, ax ≡ 1 (mod 11). So, we can multiply the congruence
x^5 + y^5 ≡ z^5 (mod 11) through by a^5 to get (ax)^5 + (ay)^5 ≡ (az)^5
(mod 11), which we can rewrite as 1 ≡ (az)^5 - (ay)^5 (mod 11). Hence we
conclude that modulo 11 the powers (az)^5 and (ay)^5 are consecutive as
residues in the complete system of residues 0, 1, 2, ..., 10. Moreover,
neither of these two powers can be congruent to 0 since 11 does not
divide either y or z (and a is relatively prime to 11).

We conclude that if 11 does not divide any of x, y, or z, then there
must be two consecutive nonzero fifth powers modulo 11. But there
aren’t! If we compute the fifth powers modulo 11, we get only the
following nonzero residues:


1^5 ≡ 3^5 ≡ 4^5 ≡ 5^5 ≡ 9^5 ≡ 1 and 2^5 ≡ 6^5 ≡ 7^5 ≡ 8^5 ≡ 10^5 ≡ 10 (mod 11).

Therefore, since there are only two nonzero fifth power residues, 1 and
10, and they are not consecutive, we have a contradiction, and so 11
divides one of x, y, or z.

In exactly this same way, then, we can also conclude that each of the
primes 2 · 4 · 5 + 1 = 41, 2 · 7 · 5 + 1 = 71, and 2 · 10 · 5 + 1 = 101 must divide
one of x, y, or z. For each of these primes of the form q = 2np + 1, we can
compute the fifth powers of the q - 1 nonzero residues modulo q and observe
in each case that no two of the resulting residues are consecutive (see Problem
11.8).
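
The computation just described is mechanical, so it can be handed to a computer. What follows is a minimal Python sketch, not Germain's own procedure; the names power_residues and has_consecutive are hypothetical. It lists the nonzero pth power residues modulo q and tests whether any two of them differ by 1.

```python
def power_residues(p, q):
    """Sorted list of the distinct nonzero pth power residues modulo the prime q."""
    return sorted({pow(x, p, q) for x in range(1, q)})

def has_consecutive(residues):
    """True if two of the given residues differ by exactly 1."""
    s = set(residues)
    return any(r + 1 in s for r in s)

print(power_residues(5, 11))                          # [1, 10]
for q in (11, 41, 71, 101):
    print(q, has_consecutive(power_residues(5, q)))   # False for each q

# By contrast, the prime 2 * 3 * 5 + 1 = 31 is missing from Germain's list:
print(power_residues(5, 31))                          # [1, 5, 6, 25, 26, 30]
print(has_consecutive(power_residues(5, 31)))         # True
```

The last two lines suggest why 31 does not appear among the primes 11, 41, 71, 101: modulo 31 the fifth power residues 5 and 6 (and also 25 and 26) are consecutive, so the argument above does not apply.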

It had been eleven years since they had last corresponded, but in 1819
Germain wrote a letter to Gauss telling him of this plan for proving
Fermat’s last theorem and describing her work: “Here is what I have
found (Voici ce que j’ai trouvé): ... .” Perhaps most significantly she
reports to him in this letter that her work has demonstrated that
Fermat’s equation could be satisfied only by numbers that are so large
they would “frighten the imagination (effraye l’imagination).” But she
also admits that she has “never been able to reach infinity (Je n’ai jamais
pu arriver à l’infini ...).”

Out of the large body of work that Germain did on Fermat’s last 
theorem there emerged one particular theorem that we now know as 
Sophie Germain’s theorem. This result was published as a footnote in a 
paper by Legendre in 1823 and was long thought to represent Germain's 
primary contribution to Fermat’s last theorem, though it should now be 
viewed as just one small part of that larger body of work. The proof we 
present, though unpublished, is essentially due to Germain. 

Theorem 11.3 (Sophie Germain's theorem). For an odd prime p, let
q ≠ p be a prime such that

(1) there are no consecutive nonzero pth power residues modulo q; and

(2) p itself is not a pth power residue modulo q.

Then, if x^p + y^p = z^p where x, y, and z are positive integers, one of x, y, or
z must be divisible by p^2.

Proof 

Assume that x^p + y^p = z^p, where x, y, and z are positive integers.
Further assume, without loss of generality, that x, y, and z are relatively
prime. As we saw earlier, as a consequence of q satisfying condition (1),
one of x, y, or z must be divisible by q. This
is because if q does not divide any of x, y, or z, then we can divide the
congruence x^p + y^p ≡ z^p (mod q) through by x^p and we can in this way
express 1 as the difference of two nonzero pth power residues modulo
q, contradicting condition (1).

Suppose, for the moment, that q | z, and write

z^p = x^p + y^p = (x + y)(x^{p-1} - x^{p-2}y + x^{p-3}y^2 - ... - xy^{p-2} + y^{p-1}).

It turns out that q cannot divide both factors on the right side of this
expression for z^p. This is because if q | x + y, then x + y ≡ 0 (mod q), and
so y ≡ -x (mod q), and we can substitute -x for y in the second factor to
get x^{p-1} + x^{p-1} + ... + x^{p-1} + x^{p-1} = px^{p-1}. So, if q also divides
the second factor, then q | px^{p-1} and, since q ≠ p, we have q | x. But then
q | x and q | x + y implies that q | y, contradicting
the fact that x and y are relatively prime.

Since this same argument works for any prime other than p,
Germain concludes that x + y and x^{p-1} - x^{p-2}y + ... + y^{p-1} can have
no common prime factors other than p. Virtually identical arguments
starting with, respectively, the suppositions that the prime q divides x,
or the prime q divides y, yield the further two conclusions that z - y and
z^{p-1} + z^{p-2}y + ... + y^{p-1} can have no common prime factors other than
p; and z - x and z^{p-1} + z^{p-2}x + ... + x^{p-1} can have no common prime
factors other than p.

Then, by way of contradiction, Germain assumes that x, y, and z are
all relatively prime to p. In particular, p ∤ z, so p divides neither x + y nor
x^{p-1} - x^{p-2}y + ... + y^{p-1}; hence x + y and x^{p-1} - x^{p-2}y + ... + y^{p-1} are
relatively prime. Thus she writes

x + y = l^p and x^{p-1} - x^{p-2}y + ... + y^{p-1} = r^p,

where z = lr and l and r are relatively prime. Similarly, she writes

z - y = h^p and z^{p-1} + z^{p-2}y + ... + y^{p-1} = n^p,

where x = hn and h and n are relatively prime, and

z - x = v^p and z^{p-1} + z^{p-2}x + ... + x^{p-1} = m^p,

where y = vm and v and m are relatively prime.

Next, Germain assumes (“to clarify the idea”) that it is z that the
prime q divides. Then


l^p + h^p + v^p = (x + y) + (z - y) + (z - x) = 2z ≡ 0 (mod q).

So, since q satisfies condition (1), q must divide one of l, h, or v. But q
can’t divide h or v since it would then in turn also divide x or y, violating
our basic assumption that x, y, and z are relatively prime. Hence q | l
and, as we have seen before, y ≡ -x (mod q), and again making the
substitution -x for y we get r^p ≡ px^{p-1} (mod q). Now, x itself is a pth
power modulo q since x ≡ -(z - x) = -v^p (mod q), so we conclude
that p is also a pth power modulo q, which violates condition (2). Since
virtually identical arguments can be used if q divides x or y (for example,
if q | x, we would use the congruence
l^p + h^p - v^p = (x + y) + (z - y) - (z - x) = 2x ≡ 0 (mod q)),
this contradiction shows that p must divide one of x,
y, or z.

Now that Germain has p dividing one of x, y, or z, she again
assumes that it is z that p divides. She then makes the claim that the
only admissible option where z = lrp is to write

x + y = l^p p^{p-1} and x^{p-1} - x^{p-2}y + ... + y^{p-1} = pr^p.


In order to verify her claim, it is useful to write w = x + y. Note that
by Fermat’s little theorem x ≡ x^p (mod p) and y ≡ y^p (mod p), so
w = x + y ≡ x^p + y^p = z^p (mod p), and so p | w. To verify her claim
it is sufficient to show that p divides x^{p-1} - x^{p-2}y + ... + y^{p-1} exactly
once. Hence we write


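x^{p-1} - x^{p-2}y + ... + y^{p-1} = (x^p + y^p)/(x + y) = (x^p + (w - x)^p)/w

= [x^p + w^p - \binom{p}{1} w^{p-1} x + ... - \binom{p}{p-2} w^2 x^{p-2} + \binom{p}{p-1} w x^{p-1} - x^p] / w

= w^{p-1} - \binom{p}{1} w^{p-2} x + ... - \binom{p}{p-2} w x^{p-2} + \binom{p}{p-1} x^{p-1}.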

Since p | w and p | \binom{p}{p-2} (indeed, p divides \binom{p}{k} for every
0 < k < p), we see that p^2 divides each term in this expression
except the last; and only p, and not p^2, divides the last term since
\binom{p}{p-1} x^{p-1} = px^{p-1} and p is relatively prime to x. Thus Germain’s claim
is verified.

Finally, to conclude that z is a multiple of p^2, Germain considers
the equation 2z - (x + y) = h^p + v^p. Note first that p^2 | x + y since
x + y = l^p p^{p-1}. So we’ll be done once we show that p^2 | h^p + v^p. But
by Fermat’s little theorem h + v ≡ h^p + v^p (mod p), and we know
that p | h^p + v^p since p | z and p | x + y. Hence p | h + v, and h ≡ -v
(mod p). Thus h^p ≡ -v^p (mod p). Since it is true in general that when
a^p ≡ b^p (mod p) then a^p ≡ b^p (mod p^2) (see Problem 11.9), it follows
that h^p ≡ -v^p (mod p^2). Therefore, p^2 | h^p + v^p, and p^2 | z as desired. The
arguments for the cases where p divides x or p divides y are similar.

This completes the proof of Sophie Germain’s theorem. ■ 




We saw above that for p = 5, the prime q = 11 satisfies both
of these conditions, so we can immediately conclude from Sophie
Germain’s theorem that if x^5 + y^5 = z^5 is a counterexample to Fermat's
last theorem, then one of x, y, or z must be divisible by 5^2 = 25.
You may have already guessed that it is no accident that the prime
q = 11 satisfies these two conditions: it does so precisely because 5 is a
Germain prime. In fact, we have the following corollary (whose proof is left to
Problem 11.10).

Corollary 1. Let p be a Germain prime. If x^p + y^p = z^p where x, y, and z are
positive integers, then p^2 divides one of x, y, or z.
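
The two conditions of Sophie Germain's theorem can be tested directly for small primes. The Python sketch below is an illustration under our own naming (is_auxiliary and auxiliary_primes are hypothetical, and "auxiliary prime" is simply a convenient label for a prime q meeting both conditions); it uses brute force and is meant only for small p and q.

```python
def is_prime(n):
    """Trial-division primality test; adequate for small n only."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_auxiliary(p, q):
    """True if the prime q != p satisfies conditions (1) and (2) of Theorem 11.3."""
    residues = {pow(x, p, q) for x in range(1, q)}   # nonzero pth power residues
    no_consecutive = not any((r + 1) in residues for r in residues)   # condition (1)
    p_not_residue = (p % q) not in residues                          # condition (2)
    return no_consecutive and p_not_residue

def auxiliary_primes(p, max_n):
    """Primes q = 2np + 1 with 1 <= n <= max_n that meet both conditions."""
    return [2 * n * p + 1 for n in range(1, max_n + 1)
            if is_prime(2 * n * p + 1) and is_auxiliary(p, 2 * n * p + 1)]

print(auxiliary_primes(5, 10))   # [11, 41, 71, 101]

# For a Germain prime p, the prime q = 2p + 1 qualifies, as Corollary 1 requires:
print(all(is_auxiliary(p, 2 * p + 1) for p in (3, 5, 11, 23, 29)))   # True
```

For p = 5 the search recovers exactly the primes 11, 41, 71, and 101 that Germain cited.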

The mathematical literature is full of inaccuracies concerning Sophie 
Germain’s theorem. For example, it is not at all unusual for writers 
to refer to Corollary 1 as Sophie Germain’s theorem, a mistake that is 
sometimes compounded by giving the result a weaker conclusion: p 
divides one of x, y, or z. 

An excellent detailed analysis of the still existing manuscripts of 
Sophie Germain that both clarifies the historical record and allows 
for a significant reassessment of Germain's work in number theory 
(and whose title is taken from Germain’s 1819 letter to Gauss) can be 
found in the following paper, in print and online: R. Laubenbacher and 
D. Pengelley, “‘Voici ce que j’ai trouvé:’ Sophie Germain’s Grand Plan to
Prove Fermat’s Last Theorem,” Historia Mathematica 37 (2010), 641-92,
doi: 10.1016/j.hm.2009.12.002.

Although they never met, Gauss had such high regard for Sophie
Germain that he recommended an honorary doctorate be awarded her
by the University of Göttingen. Sadly, she died in Paris on June 26, 1831,
of breast cancer before this degree could be conferred. In 1882, a high
school for girls was established in Paris and in 1888 it was renamed the
Lycée Sophie Germain.

Fermat's Last Theorem 

Sophie Germain was the first mathematician after Euler who made sig- 
nificant progress on Fermat’s last theorem. Fermat himself had proved 
the case n = 4 using his method of infinite descent. In 1753, Euler 
proved the case n = 3 (with a flaw that was later corrected by Legendre). 
Then, in 1825, Legendre and Dirichlet independently proved the case 
n = 5 basing their proofs on the work of Germain. 

A dramatic episode in the long saga of Fermat’s last theorem occurred
in the spring of 1847. Gabriel Lamé, who had proved the case n = 7 eight
years earlier, announced to the Paris Academy of Sciences at a meeting
on March 1 that he had proved Fermat’s last theorem. Augustin-Louis
Cauchy then announced that he too had a proof. However, both of
these proofs came crashing down on May 24 when a letter arrived
at the Academy from Ernst Eduard Kummer who pointed out that
although unique factorization holds for the integers (see Theorem 3.4),
it need not hold for number systems involving complex numbers such
as those used in the two proofs in question. Lamé and Cauchy had
simply assumed that unique factorization would still hold. Nonetheless,
Kummer could verify Fermat’s last theorem for all primes less than 100
except for 37, 59, and 67.

It would not be until the next century — and more than 350 years 
after Fermat wrote his famous note in the margin of the Arithmetica— 
that Fermat’s last theorem would finally be proved. Andrew Wiles 
announced a proof in the summer of 1993 in a series of lectures at 
Cambridge University. But a serious flaw in his proof was discovered 
in September and another year (and help from former student Richard 
Taylor) was needed for Wiles to fix his proof. 

Wiles’s proof was in fact the final piece in a very complicated puzzle 
that began in the 1950s with a conjecture connecting topology and 
number theory called the Taniyama-Shimura conjecture. This conjecture 
was first shown to be related to Fermat’s last theorem by Gerhard Frey 
in 1984. The key breakthrough came in 1986 when Ken Ribet proved 
that the Taniyama-Shimura conjecture implies Fermat’s last theorem. 
Thus Wiles’s final assault on Fermat’s last theorem was his ultimately 
successful attack on a special case of the Taniyama-Shimura conjecture. 


Problems 

11.1 Certain details contained in Gauss’s 1807 letter to Germain make it 
clear that her previous letter to him must have contained a proof of 
the following proposition: 

For a prime p of the form p = 4k + 3, if a + b is of the form
x^2 + py^2 then a^p + b^p is also of that same form.

(a) We illustrated this proposition by first writing 2 + 5 in the form
2^2 + 3 · 1^2 and then expressing 2^3 + 5^3 = 133 as 5^2 + 3 · 6^2. Find
all possible ways to express 133 in the form x^2 + 3y^2 where x
and y are nonnegative integers.

(b) Give an example to show that this proposition can fail for
primes of the form p = 4k + 1.

11.2 (H,S) (a) Confirm that, as Gauss claimed in his letter to Germain,
15 + 8 = 23 cannot be written in the form x^2 + 11y^2.

(b) In his letter to Germain, Gauss wrote 15^11 + 8^11 as
(1 595 826)^2 + 11 · (745 391)^2. Write a computer program (using
Sage, for example) to find another way to express 15^11 + 8^11 in
the form x^2 + 11y^2 where x and y are nonnegative integers.

(c) Until Waterhouse reported the results of his detective work in
his paper “A Counterexample for Germain”, it had long been a
mystery how Gauss could have possibly found such a large
counterexample. For the full story you should read
Waterhouse’s paper, but at least part of the story is that Gauss
found an error in Germain’s proof and the first case where this
shows up is for n = 11. He then realized he was looking for
numbers a and b such that a + b is prime and also for another
prime p of the form 11k + 1 that would need to divide a^11 + b^11.
He could fairly quickly eliminate smaller primes as options for
a + b and focus on a + b = 23; hence for a prime p of the form
11k + 1 he would naturally try p = 67.

Follow now in Gauss’s footsteps, starting with b = 1, until
you find a pair of numbers a and b such that a + b = 23 and
a^11 + b^11 ≡ 0 (mod 67). Recall that you can use the method of
fast exponentiation to compute powers modulo 67 quickly.

(d) Continue as in part (c) and find another pair of numbers a and b
such that a + b = 23 and a^11 + b^11 ≡ 0 (mod 67). Then show
that these numbers also produce a counterexample to
Germain’s converse proposition since for this new pair a and b

a^11 + b^11 = (661 539)^2 + 11 · (363 634)^2.

11.3 * (S) Gauss introduced the useful notion of indices in Disquisitiones
Arithmeticae. Indices are useful in number theory in exactly the same
way that logarithms are useful, and for the same reason: they are
exponents. For a given primitive root of a prime p the index of a
number a, denoted by ind a, is the exponent to which the primitive
root must be raised to get the number a modulo the prime p. So, since
2 is a primitive root for the prime 67, we can compute the powers of 2
modulo 67:

2^1 = 2, 2^2 = 4, 2^3 = 8, 2^4 = 16, 2^5 = 32, 2^6 = 64, 2^7 ≡ 61

and we see that, for example, ind 8 = 3 and ind 61 = 7. Just as we do
with logarithms where we write log_10 or log_2 in order to clearly denote
which base is being used, we can also write ind_2 61 = 7 if we wish to
make clear that 2 is the primitive root that is being used in this case.

(a) If indices are to be truly useful as an aid to computation, tables 
of indices must be available. So continue the above calculation 
to complete the following table of indices for the prime 67 and 
the primitive root 2: 


a        |  1 |  2 |  3 |  4 | ... |  8 | ... | 61 | ... | 64 | ... | 67
ind_2 a  |    |  1 |    |  2 | ... |  3 | ... |  7 | ... |  6 | ... |  X


Gauss produced exactly such a table for Disquisitiones 
Arithmeticae. 

(b) Now do parts (c) and (d) of Problem 11.2 again, but this time
exactly as Gauss would have done, using the table of indices
from part (a). For example, instead of computing (say) $10^{11}$
directly as you did in Problem 11.2, first use the table to find
ind 10 = 16; thus

$$10^{11} \equiv (2^{16})^{11} = 2^{176} \equiv 2^{44} \pmod{67},$$

and then, looking again at the table to see what number has 44
as its index, you conclude that $10^{11} \equiv 29 \pmod{67}$. Note that in
this computation we use the fact $2^{66} \equiv 1 \pmod{67}$ to reduce
$2^{176}$ to $2^{44}$ modulo 67.
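
The table lookup described in this problem can be mimicked on a computer. The following is a minimal sketch, not Gauss's own procedure: the names `index_table` and `power_via_indices` are hypothetical, and we assume the convention that the number 1 receives index 0.

```python
def index_table(primitive_root, p):
    """Map each nonzero residue a modulo p to its index (the exponent of the root)."""
    table = {}
    power = 1
    for exponent in range(p - 1):
        table[power] = exponent               # assumption: ind 1 = 0
        power = (power * primitive_root) % p
    if len(table) != p - 1:
        raise ValueError("the given number is not a primitive root modulo p")
    return table


def power_via_indices(a, exponent, table, p):
    """Compute a**exponent modulo p by multiplying the index and looking it up again."""
    reverse = {index: residue for residue, index in table.items()}
    return reverse[(table[a % p] * exponent) % (p - 1)]


table = index_table(2, 67)
print(table[10])                              # 16
print(power_via_indices(10, 11, table, 67))   # 29
```

Reducing the index modulo p - 1 = 66 is exactly the step that replaces $2^{176}$ by $2^{44}$ in the worked example.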

11.4 (H) One key property that Germain would surely have used to prove the
proposition stated in Problem 11.1 is that the product of two numbers
of the form $x^2 + ny^2$ is again a number of this same form; that is, the
following formula holds:

$$(x^2 + ny^2)(x_1^2 + ny_1^2) = (xx_1 - nyy_1)^2 + n(xy_1 + yx_1)^2.$$

(a) Verify this formula. What other formulas does this remind you
of?

(b) Illustrate how to use this formula to write 133 in the form
$x^2 + 3y^2$ by first factoring 133 into primes and writing each
prime factor in this form.

(c) Illustrate Germain’s proposition by first writing 11 + 20 in the
form $x^2 + 3y^2$, and then using this formula to also write
$11^3 + 20^3$ in this same form. (Recall from Problem 8.7 that
Fermat knew exactly which primes can be written in the form
$x^2 + 3y^2$.)
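
The product formula in this problem is easy to experiment with. Here is a small sketch (the helper names `compose_forms` and `form_value` are our own) that combines two representations $x^2 + ny^2$ into one, illustrated with $4 = 1^2 + 3 \cdot 1^2$ and $7 = 2^2 + 3 \cdot 1^2$ rather than with the numbers asked for in the problem.

```python
def compose_forms(first, second, n):
    """Given (x, y) and (x1, y1), each standing for x^2 + n*y^2,
    return a pair representing the product of the two numbers."""
    x, y = first
    x1, y1 = second
    return (abs(x * x1 - n * y * y1), abs(x * y1 + y * x1))


def form_value(pair, n):
    """Evaluate x^2 + n*y^2 for the pair (x, y)."""
    x, y = pair
    return x * x + n * y * y


product = compose_forms((1, 1), (2, 1), 3)
print(product, form_value(product, 3))  # (1, 3) 28
```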

11.5 Another property that Germain undoubtedly used to prove the 
proposition stated in Problem 11.1 involves a polynomial identity 
from Disquisitiones Arithmeticae. Although this identity is actually 
more general, we will illustrate it only for the prime p = 11. Keep in 
mind that the goal in Germain’s proposition is to show that if a + b is
of the form $x^2 + 11y^2$, then so is $a^{11} + b^{11}$.

(a) Verify the following identity:

$$4 \cdot \frac{a^{11} + b^{11}}{a + b} = (-2a^5 + a^4b + 2a^3b^2 + 2a^2b^3 + ab^4 - 2b^5)^2 + 11(a^4b - ab^4)^2.$$

(b) Conclude that if a + b is of the form $x^2 + 11y^2$, then so is
$a^{11} + b^{11}$.

11.6 (S) (a) Find all Germain primes less than 250. 

(b) Make a conjecture about what form a Germain prime p must 
have modulo 6 for p > 3. Then prove your conjecture. 

(c) Mersenne claimed in 1644 that he could list all the primes of
the form $2^p - 1$ for any prime p < 257. Of course, his list
contained several errors, as discussed in Chapter 5.

Theorem 11.1 tells us that for a Germain prime p > 3 of the
form p = 4k + 3 the corresponding prime 2p + 1 divides the
Mersenne number $2^p - 1$; hence, for such a Germain
prime p, the Mersenne number $2^p - 1$ is not a Mersenne
prime.

Which primes p in the range from 1 to 257 can be eliminated
in this way as candidates for producing a Mersenne prime
$2^p - 1$?

For the first four of these Germain primes (or for as many of
these as you can do with your calculator or using Sage), actually
verify that $2p + 1 \mid 2^p - 1$. Then, for the first five of the Germain
primes in this range of the form 4k + 1 (or, again, for as many as
you can do), actually verify that, as guaranteed by Theorem
11.1, $2p + 1 \mid 2^p + 1$.
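
If Sage is not at hand, plain Python is enough to explore this problem. The sketch below uses simple trial division and our own hypothetical helper names; it lists Germain primes below a bound and tests the two divisibility statements quoted from Theorem 11.1.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers in this problem."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def germain_primes(limit):
    """All primes p < limit for which 2p + 1 is also prime."""
    return [p for p in range(2, limit) if is_prime(p) and is_prime(2 * p + 1)]


def divisibility_holds(p):
    """For a Germain prime p, test 2p+1 | 2^p - 1 (p = 4k+3) or 2p+1 | 2^p + 1 (p = 4k+1)."""
    q = 2 * p + 1
    if p % 4 == 3:
        return pow(2, p, q) == 1
    if p % 4 == 1:
        return pow(2, p, q) == q - 1
    return None  # p = 2 is of neither form


print(germain_primes(30))  # [2, 3, 5, 11, 23, 29]
```

The three-argument `pow` keeps every intermediate result smaller than 2p + 1, so even $2^{251}$ modulo 503 is computed instantly.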

11.7 ★ In David Auburn’s play Proof, which opened in New York in 2000, the
character Catherine says that the number $92\,305 \cdot 2^{16\,998} + 1$ is the
biggest known Germain prime. Use Sage to show that Catherine’s
number is in fact a Germain prime.
Now imagine that you are playing the role of Catherine in a current
production of Proof. Try to memorize the updated line you now have
to deliver on stage: “Right. Or $18\,543\,637\,900\,515 \cdot 2^{666\,667} - 1$.”

11.8 (S) (a) Reproduce a small portion of the “march of calculation,” as
Germain put it, that leads us to believe that, for example, in the
case of p = 5, there would be infinitely many primes that would
necessarily divide one of x, y, or z if $x^5 + y^5 = z^5$. In other words,
as we did for the prime 11, conclude that each of the primes
$2 \cdot 4 \cdot 5 + 1 = 41$, $2 \cdot 7 \cdot 5 + 1 = 71$, and $2 \cdot 10 \cdot 5 + 1 = 101$ must
divide one of x, y, or z.

(b) Explain why, in the case of p = 5, we do not include the primes
$2 \cdot 3 \cdot 5 + 1 = 31$ or $2 \cdot 6 \cdot 5 + 1 = 61$ in this “march of calculation.”

11.9 (H) Prove that for any prime p, if $a^p \equiv b^p \pmod{p}$, then
$a^p \equiv b^p \pmod{p^2}$.

11.10 * (H) Prove the corollary to Theorem 11.3 by showing that for a Germain
prime p, the prime q = 2p + 1 satisfies the two conditions of Theorem 11.3.

11.11 Germain eventually learned that her grand plan could not succeed for 
all primes p > 2, as she had hoped. In fact she wrote a three-page letter 
to Legendre containing a proof that for p = 3, the primes 7 and 13 are 
the only primes of the form q = 2np + 1 having the property that none
of the nonzero third power residues modulo q are consecutive (this
undated letter is now in the New York Public Library).

(a) Verify that for p = 3 the primes $2 \cdot 1 \cdot 3 + 1 = 7$ and
$2 \cdot 2 \cdot 3 + 1 = 13$ satisfy this nonconsecutive residue property for
third powers modulo 7 and modulo 13, respectively.

Then, also verify that the primes $2 \cdot 3 \cdot 3 + 1 = 19$ and
$2 \cdot 5 \cdot 3 + 1 = 31$ do not satisfy this property.

We can illustrate Germain’s proof that for p = 3 there will
always be consecutive nonzero third power residues for primes
of the form q = 2np + 1 whenever q > 13 by considering the
case q = 19. Let r be a primitive root of the prime 19. Since a
residue $r^i$ will be a cube only when i is a multiple of 3, we know
that of the eighteen nonzero residues modulo 19 exactly six of
these will be cubic residues (that is, third powers) and twelve will
be cubic nonresidues (that is, not cubic residues).
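
The cubic residues themselves are quick to list by machine. The following sketch (function names are our own) computes the nonzero third powers modulo a prime q and reports whether two of them are consecutive; it checks the statement directly and is not Germain's argument.

```python
def cubic_residues(q):
    """Sorted list of the distinct nonzero third powers modulo q."""
    return sorted({pow(a, 3, q) for a in range(1, q)})


def has_consecutive_cubic_residues(q):
    """True when some nonzero cubic residue r has r + 1 as a cubic residue too."""
    residues = set(cubic_residues(q))
    return any((r + 1) in residues for r in residues)


print(cubic_residues(19))                  # [1, 7, 8, 11, 12, 18]
print(has_consecutive_cubic_residues(19))  # True
```

For q = 19 the list has six entries, in agreement with the count of eighteen divided by three given above.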

The idea behind Germain’s proof is to consider how the six
cubic residues are distributed among the eighteen nonzero
residues. First, we will consider the case where no two nonzero
cubic residues have a difference of either 1 or 2. This means that
in a gap between two cubic residues there must be at least two
cubic nonresidues. Since there are twelve cubic nonresidues and
five gaps between the cubic residues, if we ignore the order of
these gaps for the moment, there are only two possibilities for
the overall gap sizes, either 3, 3, 2, 2, 2 or 4, 2, 2, 2, 2.

However, the gaps must fall symmetrically (since
$(-a)^3 = -a^3$), so there are three possibilities for the gaps:
2, 2, 4, 2, 2; 2, 3, 2, 3, 2; or 3, 2, 2, 2, 3. The first option is impossible
because it yields 1, 4, 7, 12, 15, 18 as the cubic residues and this
misses the cubic residue 8. The second option yields
1, 4, 8, 11, 15, 18 as the cubic residues. But this is also
impossible because if 4 is a cubic residue, then 2 would also have
to be a cubic residue since 2 = 8/4 (let $4 \equiv a^3$ and write $ab \equiv 1 \pmod{19}$
where b is the multiplicative inverse of a; then
$4b^3 \equiv a^3b^3 \equiv 1 \pmod{19}$, and $2 \equiv 8b^3 = (2b)^3 \pmod{19}$). We
are thus left with the third option, which yields
1, 5, 8, 11, 14, 18 as the sequence of cubic residues. However,
this sequence is missing the cubic residue $4^3 \equiv 7 \pmod{19}$. This
contradiction means that there are two cubic residues having a
difference of either 1 or 2.

We next consider the case where no two nonzero cubic
residues have a difference of 1, but there are two nonzero cubic
residues x and y such that x - y = 2. In this case, 2 is not a cubic
residue (since 1 is a cubic residue), and so we can write

$$2 \equiv r^{3k \pm 1} \pmod{19}.$$

Now, x and y cannot be a pair of additive inverses, that is,
$x + y \not\equiv 0 \pmod{19}$, since if x + y = 19, then
19 = x + y = (x - y) + 2y = 2 + 2y would be even. Therefore,
x + y is a nonzero residue, but it can’t be a cubic residue since if
$x + y \equiv c^3 \pmod{19}$ where $x \equiv a^3 \pmod{19}$ and $y \equiv b^3 \pmod{19}$,
then $a^3 \equiv c^3 - b^3 \pmod{19}$, and we could divide
through by $a^3$ (that is, multiply through by the cube of the
multiplicative inverse of a), and we would have two cubes whose
difference is 1, which is impossible in this case. So we can write

$$x + y \equiv r^{3j \pm 1} \pmod{19}.$$

We claim that the plus and minus signs in the expressions
$2 \equiv r^{3k \pm 1}$ and $x + y \equiv r^{3j \pm 1}$ must agree. Suppose, for example,
that $2 \equiv r^{3k+1}$ and $x + y \equiv r^{3j-1}$; then
$x^2 - y^2 = (x - y)(x + y) = 2(x + y) \equiv r^{3k+1} r^{3j-1} = r^{3(k+j)} \pmod{19}$, and
since $x^2$ and $y^2$ are both cubic residues (x and y are cubic
residues after all), we again have a contradiction where a cubic
residue is a difference of two cubic residues. The argument is the
same if the signs are reversed.

Finally, write $2x = (x - y) + (x + y) = 2 + (x + y) \equiv r^{3k \pm 1} + r^{3j \pm 1} \pmod{19}$,
and we can divide this by 2 (that is, by $r^{3k \pm 1}$) to get
$x \equiv 1 + r^{3(j-k)} \pmod{19}$, which is a contradiction
since we now have two cubic residues, x and $r^{3(j-k)}$, whose
difference is 1.

Having eliminated two cases, we are left with the conclusion 
that there are two nonzero cubic residues whose difference is 1, 
that is, two consecutive nonzero cubic residues. 

(b) Show how this same proof would work for the prime q = 31. In
particular, determine what options there are for the pattern of
gap sizes between the cubic residues; and for each option give a
reason why the resulting sequence of cubic residues is
impossible. The argument you use here for q = 31 should ideally
also work for all larger primes of the form 2np + 1.



Fibonacci Numbers 


The Fibonacci sequence 

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, ... 

is certainly the most famous of all number sequences, and although 
these numbers may truly be said to exist in nature, historically they 
come to us from medieval Italy. It is somewhat misleading that we 
associate these numbers so strongly with the name Fibonacci since they 
held almost no interest for him and appeared only briefly as a solution 
to a problem in his most famous book, Liber abaci: 

A man put one pair of rabbits in a certain place entirely enclosed 
by a wall. How many pairs of rabbits were created from that one 
pair in one year, if it is the nature of rabbits that each month 
every (mature) pair bears another new pair, which in the second 
month also begin to bear? 

It is likely that Fibonacci himself was not even aware of the property 
that makes the sequence which is named for him so famous: namely, 
that if we let $F_n$ be the nth Fibonacci number, then

$$F_n = F_{n-1} + F_{n-2}.$$

This recurrence relation was not given as a definition for the Fibonacci 
sequence until 1634 by Albert Girard, and it was not until the nineteenth
century that this sequence was first named for Fibonacci by
the French mathematician Édouard Lucas, who appropriately now has
a similar sequence named after him. In Problem 12.1 you are asked 
to show that the solution to Fibonacci’s “rabbit problem” is indeed a 
Fibonacci number. 

Although the Fibonacci numbers will forever be associated with 
Fibonacci and his rabbits, these numbers were studied in India more 
than two thousand years ago. They appear, for example, in a standard 
book on Sanskrit metrics written around 200 B.C., the Chandahsutra
(Prosody Rules) by Piṅgala.

The basic units in Sanskrit prosody are syllables with one “mora” 
(called “light”), and syllables with two moras (called “heavy”). If we
denote a light syllable by L and a heavy syllable by H, then we can
count the number of possible meters with a given number of moras.
There is only one possible variation for a one-mora meter (L). There are
two variations for a two-mora meter (LL, H). There are three variations
for a three-mora meter (HL, LLL, LH) and five variations for a four-mora
meter (HLL, HH, LHL, LLLL, LLH). In general, the number of
possible variations for a meter with n moras is the Fibonacci number
$F_{n+1}$ (see Problem 12.2).
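
The count of meters can be checked by brute force. The sketch below (the function name `meters` is our own) lists every sequence of light and heavy syllables with a given total number of moras by choosing the first syllable and recursing on what remains.

```python
def meters(n):
    """All strings of L (one mora) and H (two moras) totalling n moras."""
    if n == 0:
        return [""]
    result = ["L" + rest for rest in meters(n - 1)]      # first syllable light
    if n >= 2:
        result += ["H" + rest for rest in meters(n - 2)]  # first syllable heavy
    return result


print(meters(3))                               # ['LLL', 'LH', 'HL']
print([len(meters(n)) for n in range(1, 8)])   # [1, 2, 3, 5, 8, 13, 21]
```

The two branches of the recursion are the reason the counts obey the Fibonacci recurrence.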


Fibonacci 

Leonardo of Pisa, better known as Fibonacci, was born in Pisa around 
1180 and was to become the single most influential mathematician 
in Europe during the later Middle Ages in the twelfth and thirteenth 
centuries. This was a time when Arab texts and Greek manuscripts were 
gradually making their way along the great Mediterranean trade routes 
into Europe where they would eventually be collected and translated. 

Although Fibonacci was born in Italy, he grew up in North Africa, 
where his father was working. As a result he was able to study under 
Muslim teachers and travel widely in Egypt, Syria, and Greece. It was 
during these travels that he learned of the Hindu-Arabic number system 
and methods of calculation. Fibonacci begins Liber abaci, written in 
1202, by explaining that in India there are nine figures and that with 
these nine figures and the symbol 0 any number can be expressed. 
He then uses a variety of problems taken from Arab sources such as 
the works of al-Khwarizmi and abu-Kamil and from Greek texts to 
demonstrate how very much better the Hindu-Arabic number system 
is than the Roman system. The second edition of Liber abaci came 
out in 1228, and the use of the Hindu-Arabic number system for the 
calculations needed in such practical matters as everyday commerce 
and surveying quickly spread throughout Europe. 


The Fibonacci Sequence 

The Fibonacci sequence 1, 1, 2, 3, 5, 8, 13, ..., has an extraordinary
number of interesting properties that people have noticed through the
years. We begin our study of this fascinating sequence by unlocking
what is perhaps its single most amazing property.

First, we observe that the Fibonacci sequence $\{F_n\}$ is defined
recursively by

$$F_1 = 1, \quad F_2 = 1, \quad \text{and} \quad F_n = F_{n-1} + F_{n-2} \ \text{ for } n > 2.$$

Now, let $\alpha$ be the irrational number $\alpha = \frac{1+\sqrt{5}}{2}$. This is one of the most
famous numbers in all of mathematics. It is called the golden ratio, and
we will soon discover what this particular irrational number has to do
with the Fibonacci sequence.

Similarly, let $\beta$ be the number $\beta = \frac{1-\sqrt{5}}{2}$. These two numbers $\alpha$ and
$\beta$ were not just randomly pulled out of a hat. They are the two roots of
the characteristic polynomial $x^2 - x - 1$ of the recurrence relation for
the Fibonacci sequence. More generally, the characteristic polynomial for
a recurrence relation of the form $s_n = as_{n-1} + bs_{n-2}$, where a and b are
nonzero constants, is the polynomial $x^2 - ax - b$. This means that $\alpha$ and
$\beta$ are the two solutions to the equation $x^2 - x - 1 = 0$. You should check
this either by using the quadratic formula or by actually plugging $\alpha$ and
$\beta$ into this equation.

You should also verify for yourself that $\alpha$ and $\beta$ satisfy two very
convenient identities:

$$\alpha + \beta = 1 \quad \text{and} \quad \alpha\beta = -1.$$

We will use these two identities over and over again in this chapter.

Therefore, we can now rewrite the Fibonacci recurrence relation
$F_n = F_{n-1} + F_{n-2}$ as $F_n = (\alpha + \beta)F_{n-1} - (\alpha\beta)F_{n-2}$, and get

$$F_n - \alpha F_{n-1} = \beta(F_{n-1} - \alpha F_{n-2}).$$

This is indeed quite surprising, because it means that the “linked”
sequence of terms

$$1 - \alpha \cdot 1, \quad 2 - \alpha \cdot 1, \quad 3 - \alpha \cdot 2, \quad 5 - \alpha \cdot 3, \quad 8 - \alpha \cdot 5, \quad 13 - \alpha \cdot 8$$

forms a geometric sequence whose constant ratio is $\beta$. (Now would be a
good time to do Problem 12.5.)

This means that $F_n - \alpha F_{n-1} = (1 - \alpha)\beta^{n-2}$; or, in other words, since
$\alpha + \beta = 1$ (that is, $1 - \alpha = \beta$),

$$F_n = \alpha F_{n-1} + \beta^{n-1}.$$

Moreover, since in this entire argument we never used the specific
values of $\alpha$ and $\beta$, but rather used only the two symmetric formulas
$\alpha + \beta = 1$ and $\alpha\beta = -1$, we can switch the roles of $\alpha$ and $\beta$ and produce
another identity:

$$F_n = \beta F_{n-1} + \alpha^{n-1}.$$
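
A numerical spot check of these identities is reassuring, although it proves nothing. In the sketch below (all names are our own) floating-point values of the two roots are used, so equalities hold only up to rounding error.

```python
import math

ALPHA = (1 + math.sqrt(5)) / 2   # the golden ratio
BETA = (1 - math.sqrt(5)) / 2    # the other root of x^2 - x - 1


def fibonacci(count):
    """Return the list [F_1, ..., F_count] with F_1 = F_2 = 1."""
    terms = [1, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]


def linked_terms(count):
    """The terms F_{n+1} - ALPHA * F_n for n = 1, ..., count."""
    f = fibonacci(count + 1)
    return [f[i + 1] - ALPHA * f[i] for i in range(count)]


def identity_error(n):
    """Largest rounding error in the two identities for F_n (n >= 2)."""
    f = fibonacci(n)
    fn, fprev = f[-1], f[-2]
    return max(abs(fn - (ALPHA * fprev + BETA ** (n - 1))),
               abs(fn - (BETA * fprev + ALPHA ** (n - 1))))


terms = linked_terms(8)
print([terms[i + 1] / terms[i] for i in range(7)])  # each ratio is close to BETA
print(identity_error(20))                            # a tiny rounding error
```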

We are now ready to prove a truly remarkable property of the Fibonacci
sequence: if we look at the sequence consisting of the ratios of
consecutive terms in the Fibonacci sequence, that is, the ratios

$$\frac{1}{1}, \ \frac{2}{1}, \ \frac{3}{2}, \ \frac{5}{3}, \ \frac{8}{5}, \ \frac{13}{8}, \ \frac{21}{13}, \ \ldots,$$

then we see that this sequence seems to be converging to a number
slightly less than 1.62 (see Problem 12.7). In fact, the number this
sequence is converging to is the golden ratio, $\alpha = \frac{1+\sqrt{5}}{2}$, which is
approximately 1.618034. This intimate connection between the Fibonacci
sequence and the golden ratio was first discovered by the astronomer
Johannes Kepler, who referred to the golden ratio as a “precious jewel.”

Theorem 12.1. If $F_1, F_2, F_3, \ldots$ is the Fibonacci sequence, then the
sequence $\{F_{n+1}/F_n\}$ of ratios of consecutive terms converges to $\alpha = \frac{1+\sqrt{5}}{2}$ as n goes
to infinity. Moreover, these ratios fall alternately above or below $\alpha$ according
to whether n is even or odd.


Proof 

From the preceding discussion we have $F_n = \alpha F_{n-1} + \beta^{n-1}$ or, equivalently,
$F_{n+1} = \alpha F_n + \beta^n$. Dividing by $F_n$ we get

$$\frac{F_{n+1}}{F_n} = \alpha + \frac{\beta^n}{F_n}.$$

First, we note that since $\beta$ is negative, $\frac{F_{n+1}}{F_n} > \alpha$ when n is even, and
$\frac{F_{n+1}}{F_n} < \alpha$ when n is odd. Next, since $|\beta| < 1$ and since $F_n \to \infty$,
we conclude that $\lim_{n \to \infty} \frac{\beta^n}{F_n} = 0$. Therefore,

$$\lim_{n \to \infty} \frac{F_{n+1}}{F_n} = \alpha,$$

and this completes the proof.
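
Both claims of Theorem 12.1 can be watched in action. This short sketch (the name `consecutive_ratios` is our own) prints the ratios $F_{n+1}/F_n$ together with a marker showing on which side of the golden ratio each one falls.

```python
import math

ALPHA = (1 + math.sqrt(5)) / 2


def consecutive_ratios(count):
    """Return [F_2/F_1, F_3/F_2, ...], a list of `count` ratios."""
    ratios = []
    previous, current = 1, 1
    for _ in range(count):
        ratios.append(current / previous)
        previous, current = current, previous + current
    return ratios


for n, ratio in enumerate(consecutive_ratios(10), start=1):
    side = "above" if ratio > ALPHA else "below"
    print(n, round(ratio, 6), side)   # below for odd n, above for even n
```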


The Golden Ratio 

We first met the irrational number $\alpha$ known as the golden ratio in
Problem 3.36 as the infinite continued fraction

$$\alpha = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}}.$$

In fact, this was our first clue that $\alpha$ might be a particularly interesting
number.
Since an identical copy of this continued fraction appears within
itself below each fraction bar, we can immediately see that $\alpha$ must satisfy
the following equations:

$$x = 1 + \frac{1}{x}, \qquad x = 1 + \cfrac{1}{1 + \cfrac{1}{x}}, \qquad x = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{x}}}.$$

In particular, the first equation reduces to the quadratic equation
$x^2 - x - 1 = 0$, which has two solutions: a positive solution $\alpha = \frac{1+\sqrt{5}}{2}$
and a negative solution $\frac{1-\sqrt{5}}{2}$. This of course is the same equation
that arose in the last section when we considered the characteristic
polynomial of the recurrence relation for the Fibonacci sequence.
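
One way to see the continued fraction and the Fibonacci ratios meet is to apply the rule x = 1 + 1/x repeatedly, starting from x = 1, using exact fractions. The sketch below is only an illustration, and the name `convergents` is our own.

```python
from fractions import Fraction


def convergents(count):
    """Iterate x -> 1 + 1/x starting from x = 1, keeping exact fractions."""
    x = Fraction(1)
    values = [x]
    for _ in range(count - 1):
        x = 1 + 1 / x
        values.append(x)
    return values


print([str(value) for value in convergents(7)])
# ['1', '2', '3/2', '5/3', '8/5', '13/8', '21/13']
```

Each step cuts the infinite continued fraction off one level deeper, and the results are exactly the ratios of consecutive Fibonacci numbers.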

It is worth mentioning that the Greek letter $\phi$ (phi) is often used
for the number $\frac{1+\sqrt{5}}{2}$, though this is by no means universal. Other
Greek letters are also used for the golden ratio, such as $\tau$ (tau), and the
letter $\phi$ is frequently used to stand for a wide variety of other things in
mathematics and physics.

We should also consider why the number $\frac{1+\sqrt{5}}{2}$ is called a ratio. This is
a geometric idea that, not surprisingly, goes back to the ancient Greeks
and first appeared in Euclid’s Elements. If you take a line segment and
cut it into two parts so that the ratio of the larger of the two parts to the
smaller of the two parts equals the ratio of the original segment to the
larger part, then this ratio is called the golden ratio.

If we let the original line segment have length x and the larger part
of this line segment have length 1, then the smaller part has length
x - 1. Hence we get the equation $\frac{1}{x-1} = \frac{x}{1}$, which again reduces to
$x^2 - x - 1 = 0$. Since in this case x must be positive, the positive solution
$\alpha = \frac{1+\sqrt{5}}{2}$ is the golden ratio. In Problem 12.8 we show how to construct
a line segment whose length is the golden ratio.

The golden ratio has a habit of showing up unexpectedly. For example,
if you draw a regular pentagon where each side has length 1, then
any diagonal of this pentagon (that is, a line segment drawn from any
vertex to either of the two opposite vertices) will have length $\alpha = \frac{1+\sqrt{5}}{2}$
(see Problem 12.9).

A great deal of nonsense has been written about the golden ratio.
For example, many people firmly believe that the builders of the
Great Pyramid of Giza based its dimensions on the golden ratio (see 
Problem 12.11). Similarly, the Parthenon supposedly was constructed 
to exhibit various aesthetic qualities of the golden ratio. However, 
there is no reason at all to believe that the Greeks ever thought of the 
golden ratio as anything more than a number that is mathematically 
interesting or, similarly, that the “appearance” of the golden ratio in 
Egyptian pyramids is anything more than a numerical coincidence. 

It was during the Renaissance that the aesthetic potential of the 
golden ratio began to be explored. Most notably, in 1509, a Franciscan
friar, Luca Pacioli (now known as the “father of accounting”),
published a work entitled De Divina Proportione (illustrated by Leonardo 
da Vinci). Pacioli referred to the golden ratio as the “divine proportion” 
and attached considerable religious significance to this number. One 
of Pacioli’s strangest ideas relating to the aesthetics of the golden ratio 
was that in an ideal human body one’s belly button should divide one’s 
height into the golden ratio! We previously encountered another of 
Pacioli’s ideas in Problem 5.15. 

These days the golden ratio is often defined in terms of a rectangle
whose dimensions are claimed to make it the most aesthetically pleasing
among all rectangles (a dubious claim at best). This rectangle, called
the golden rectangle, is one for which the ratio of its length to its height
is $\alpha = \frac{1+\sqrt{5}}{2}$. What makes this approach appealing is that the golden
rectangle can be defined in purely geometric terms because it has an
intriguing property that it contains infinitely many copies of itself,
much as the continued fraction representation of $\alpha$ does.

We define a golden rectangle to be a rectangle of height y and length x
such that if a square is formed as in the rectangle shown in Figure 12.1
(where we arbitrarily let y = 1), then the small rectangle that is formed
on the right will have exactly the same proportions as the original
rectangle; hence this rectangle will also be a golden rectangle.

Figure 12.1 The golden rectangle.

Since these two rectangles have the same proportions, we get the
equation $\frac{x}{1} = \frac{1}{x-1}$, which reduces as expected to $x^2 - x - 1 = 0$; therefore, in
this golden rectangle, $x = \alpha$, and the ratio of the length to the height
is indeed the golden ratio. Note that this process could be repeated
with the smaller rectangle, producing yet another even smaller golden
rectangle; in this way, any golden rectangle contains infinitely many
copies of itself.

The suggestion that the dimensions of the Parthenon’s original facade
formed a golden rectangle has frequently been offered, but it is
highly unlikely that this was intentional. Similarly, golden rectangles 
can be envisioned in the paintings of Leonardo da Vinci, but there 
is no evidence to support the idea that he was using such rectangles 
consciously in the composition of these paintings. On the other hand, a 
number of more recent artists have deliberately used golden rectangles 
in their work, for instance, Georges Seurat, Salvador Dali, and Piet 
Mondrian, as well as the Swiss architect Le Corbusier. A similar situation
exists in music; for example, the golden ratio appears in works
by Claude Debussy although there is no evidence that this was done 
consciously, but it is not at all uncommon for modern composers to 
deliberately use the golden ratio in their compositions. 

However, the golden ratio does just keep showing up. It turns out 
that the icosahedron contains several golden rectangles. If you pick 
up a model of an icosahedron holding it in your fingers by a pair of 
opposite edges, then you are holding the two ends of a golden rectangle 
that runs right through the middle of the icosahedron. This is because 
the long edge of this rectangle is a diagonal of a regular pentagon (see 
Problem 12.9). In fact, since an icosahedron has thirty edges, it has
fifteen pairs of opposite edges; hence an icosahedron contains fifteen
golden rectangles!

Fibonacci Numbers in Nature 

Theorem 12.1 tells us that the Fibonacci numbers are closely related to
the golden ratio $\alpha = \frac{1+\sqrt{5}}{2}$. In this section we pursue the possibility that
Nature may have discovered this fact long before mathematicians did.

Figure 12.2 is a picture of a flower with very obvious spiral patterns 
radiating out from the center. In particular, if you count the spirals 
that radiate out in a clockwise direction you will discover that there are 
$55 = F_{10}$ of these spirals. If you count counterclockwise you will find
$34 = F_9$.

This can hardly be a coincidence. Something is going on here if the 
natural world can produce these two Fibonacci numbers in a single 
flower. Moreover, other plants exhibit similar patterns. If you look at a 
pineapple, the spiral pattern jumps out at you: the bracts on a pineapple 
almost always form thirteen spirals in one direction and eight in the 
other. The buds on an asparagus spear also form spirals, five clockwise 
and three counterclockwise. Pinecones do the same thing. 

What all of these plants have in common we can describe in terms 
of a flower such as the one in the picture. As the flower grows, think of 
the stem of the flower producing one floret at a time in the center of 
the flower — this happens at the growing tip, called the meristem. Then, 
as each floret grows and needs to occupy space it will move radially
outward to make room for new florets as more and more florets are
produced in the center. Where should these florets be produced on the
edge of this circular meristem at the center of the flower?

Figure 12.2 Clockwise and counterclockwise spirals in a sunflower.

One obvious answer would be if there were a fixed number of positions
on the meristem. This doesn’t seem like such a good idea since this
would leave a lot of wasted space between florets as they moved radially 
outward in a fixed number of straight lines from the center. So plants 
that happen to choose this strategy would not be very well adapted to 
survive and pass this strategy on to their descendants. Note, however, 
that at least one very familiar plant evolved adopting this very strategy. 
As new kernels of corn are produced in the center at the tip of a corn 
cob, they do line up in perfect rows with all the other kernels. Corn 
“solved” the problem of wasted space by evolving a thick cylindrical 
surface upon which to place these straight rows of kernels. 

So the florets in a flower need to be produced in a way that will 
not end up wasting space. In 1868 the botanist Wilhelm Hofmeister 
provided several basic principles for plant growth. The florets would 
be produced one at a time in the center in the least crowded spot at 
the edge of the circular meristem, and then they would move radially 
outward as they grow, though gradually slower as they approach the 
edge. Computer-generated images using Hofmeister’s principles closely 
resemble what occurs with actual plants. 

Still, it is one thing for a computer program at each stage to do some 
simple calculations to find the point on the perimeter of the meristem 
that is farthest away from the existing florets, but how does a plant do 
this? It seems that what many plants do instead, but with the same 
result, is use what botanists call a divergence angle, and each new floret 
is produced on the perimeter of the meristem at a fixed angle $\theta$ from
the previous floret. Moreover, among plants that use this fixed-angle 
approach to produce spiral patterns, about 92% use the angle 222.5°. 
This includes those that use the angle 360° - 222.5° = 137.5°, which 
merely reverses the direction of rotation (apparently, plants may prefer 
one angle or the other depending upon the hemisphere they are in). 

Now things get really interesting because $222.5^\circ \approx 360^\circ/\alpha$, where
$\alpha = \frac{1+\sqrt{5}}{2}$. There are good mathematical reasons why nature seems to have
“selected” the angle $360^\circ/\alpha$ as such a common divergence angle for plant
growth. First, let’s see how this produces Fibonacci numbers. Imagine 
a flower just at the moment it has produced its 144th floret, and let’s 
number these florets by their age. So floret 144 is the oldest and is on 
the perimeter of the flower, while floret 1 has just been produced and is 
still right next to the meristem. 

Where is floret 89? We claim it is right next to floret 144, slightly
inside of it in fact. Here is the reason. Floret 144 was produced at a
certain position on the meristem, and then after 55 rotations by the
divergence angle floret 89 was produced. But $55/\alpha \approx 33.99$, which
means that floret 89 was produced in a position almost exactly 34 full
rotations from the position for floret 144. So there is a ring of the
55 oldest florets (that is, florets 90-144) arranged evenly around the
outside perimeter of the flower. Then, just inside of these florets is
another ring of the 34 next oldest florets (that is, florets 56-89) because
floret 55 is next to and slightly inside of floret 89, and so on. Thus the
florets whose ages are Fibonacci numbers were all produced at almost
exactly the same position on the meristem. This is why when we count spirals
on a flower we get Fibonacci numbers.
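
The arithmetic behind this claim is easy to reproduce. In the following sketch (function names are our own, and the divergence angle is taken to be exactly 360 degrees divided by the golden ratio, which is an idealization) we measure how far a floret lands from one produced a given number of steps earlier.

```python
import math

ALPHA = (1 + math.sqrt(5)) / 2


def turns_between(age_gap):
    """Number of full turns swept out by age_gap rotations of 360/ALPHA degrees."""
    return age_gap / ALPHA


def angular_offset_degrees(age_gap):
    """Signed angle, in degrees, by which the later floret misses the earlier position."""
    turns = turns_between(age_gap)
    return (turns - round(turns)) * 360


print(round(turns_between(55), 2))            # 33.99
print(round(angular_offset_degrees(55), 1))   # about -2.9 degrees
print(round(angular_offset_degrees(50), 1))   # about -35.4 degrees
```

A gap of 55 brings the floret back to within about three degrees of where it started, whereas a gap such as 50, which is not a Fibonacci number, misses by more than thirty degrees.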

Another property of Fibonacci numbers comes into play here. This 
has to do with the even spacing of the outer ring of the oldest florets. Let 
n be the number of florets in this outer ring (so n is a Fibonacci number). 
Since these florets are evenly spaced they were produced at n distinct 
positions that are also evenly spaced around the meristem. Number 
these n positions 1 through n in a clockwise direction starting with the 
position where the first floret was produced. Now the second floret was 
produced at a fixed angle from this position at position k + 1 where k is 
some integer. Then the third floret began at position 2k + 1, the fourth 
at position 3k + 1, and so on, where as we move around the circle we say
position “mk + 1” when we of course really mean position mk + 1 modulo
n. The integer k is determined by the divergence angle $360^\circ/\alpha$ since the
even spacing means that $k \cdot \frac{360^\circ}{n} \approx \frac{360^\circ}{\alpha}$. So $\frac{n}{k} \approx \alpha$, which is true if k and n
are consecutive Fibonacci numbers. But consecutive Fibonacci numbers
are relatively prime (see Problem 12.25), thus guaranteeing that each of
the n positions 1, k+1, 2k+1, 3k+1, ..., (n-1)k+1 is distinct modulo n.
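
The role played by relative primality can be seen in a quick experiment. The sketch below (the name `floret_positions` is hypothetical) lists the positions 1, k+1, 2k+1, ... reduced modulo n and counts how many different positions are actually used.

```python
from math import gcd


def floret_positions(k, n):
    """Positions 1, k+1, 2k+1, ..., (n-1)k+1 reduced modulo n into the range 1..n."""
    return [(m * k) % n + 1 for m in range(n)]


# Consecutive Fibonacci numbers: every one of the 55 positions is used once.
print(gcd(34, 55), len(set(floret_positions(34, 55))))  # 1 55

# A step that shares a factor with n keeps revisiting the same few positions.
print(gcd(10, 55), len(set(floret_positions(10, 55))))  # 5 11
```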

We now have a very elegant and satisfying model for plants and 
one that closely reflects what can be observed in actual plant growth. 
The question remains: Does this mean that plants have discovered 
this optimal angle through a process of natural selection? It is indeed 
tempting to believe that nature has in this way discovered the golden 
ratio by a long evolutionary path that gradually abandoned other less 
practical divergence angles along the way. 

However, as appealing as this idea may be, it is far more likely that 
the mathematics we see here is just a coincidence, and is in reality 
merely a consequence of the way in which a plant “decides” where on 
the meristem to produce the next floret. Let’s go back to Hofmeister's 
basic principle that a floret would be produced in the least crowded 
spot at the edge of the meristem. Our mathematical model based on 
a divergence angle is one way to achieve this goal, but how does a 
plant find this “least crowded spot”? Almost undoubtedly some sort 
of chemical mechanism has to be involved (this was first suggested
more than a century ago). It could be as simple as having a particular
chemical present around the meristem that promotes growth of new 
florets, but then these new and growing florets deplete this chemical 
in the immediate area around them. For a good overview of the way 
in which plants grow, consult the following website on phyllotaxis: 
www.math.smith.edu/phyllo. 


Binet's Formula 

Let’s get back to the mathematical properties of the Fibonacci numbers.
We have defined the Fibonacci numbers 1, 1, 2, 3, 5, 8, 13, 21, 34, ...,
in terms of the recurrence relation $F_n = F_{n-1} + F_{n-2}$ because this is such
a natural way to generate this sequence of numbers term after term.
On the other hand, this definition is not especially useful if we want to
find a particular Fibonacci number such as $F_{50}$ because we then have to
find all forty-nine previous Fibonacci numbers just so we can finally add
$F_{49}$ and $F_{48}$ to get $F_{50}$. What we need is a formula for the nth Fibonacci
number. It turns out that there is such a formula, one that is not only
beautiful but that even involves the golden ratio. (I told you the golden
ratio just keeps showing up.)

Because we will also soon wish to find a formula for the sequence 
1, 3, 4, 7, 11, 18, 29, 47, . . . , a sequence that follows exactly the same
recurrence relation as the Fibonacci sequence, but simply begins with 1 
and 3 instead of 1 and 1, we state the following theorem without being 
specific about the first two terms in the sequence. 

Theorem 12.2. If $s_0, s_1, s_2, s_3, \ldots$ is a sequence defined by the recurrence
relation $s_n = s_{n-1} + s_{n-2}$ for $n \ge 2$, then

$$s_n = a\alpha^n + b\beta^n,$$

for all $n \ge 0$, where $\alpha = \frac{1+\sqrt{5}}{2}$ and $\beta = \frac{1-\sqrt{5}}{2}$, and where a and b are the
solutions to the following system of simultaneous equations:

$$s_0 = a + b,$$

$$s_1 = a\alpha + b\beta.$$


Proof 

We give a proof by induction. First, note that since a and b are solutions
to the system of simultaneous equations, the formula $s_n = a\alpha^n + b\beta^n$
automatically holds for n = 0 and n = 1. Now, suppose $n \ge 2$ and
assume that the formula holds for all nonnegative integers less than n.
Then, using the induction hypothesis as well as the identities $\alpha^2 = \alpha + 1$
and $\beta^2 = \beta + 1$, we get

$$\begin{aligned}
s_n = s_{n-1} + s_{n-2} &= (a\alpha^{n-1} + b\beta^{n-1}) + (a\alpha^{n-2} + b\beta^{n-2}) \\
&= a\alpha^{n-2}(\alpha + 1) + b\beta^{n-2}(\beta + 1) = a\alpha^{n-2}(\alpha^2) + b\beta^{n-2}(\beta^2) \\
&= a\alpha^n + b\beta^n,
\end{aligned}$$

as desired. This completes the proof. ■ 

Note that in this theorem the sequence begins with $s_0$ rather than
with $s_1$. This is done merely for convenience since it is somewhat easier
to solve the system of equations $s_0 = a + b$, $s_1 = a\alpha + b\beta$ than it is
to solve $s_1 = a\alpha + b\beta$, $s_2 = a\alpha^2 + b\beta^2$. But it does mean that we now
need a 0th term for our Fibonacci sequence, so we define $F_0 = 0$ (so that
$F_0 + F_1 = F_2$).

The following formula for the nth Fibonacci number was proved by 
the French mathematician Jacques Philippe Marie Binet in 1843 and is 
now known as Binet's formula. However, it was conjectured in 1718 by 
Abraham de Moivre, and first proved in 1728 by Nicolas Bernoulli. 

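Corollary 1 (Binet's formula). Let $F_0, F_1, F_2, F_3, \ldots$ be the Fibonacci
sequence 0, 1, 1, 2, 3, 5, 8, 13, 21, . . . . Then, for all $n \ge 0$,

$$F_n = \frac{\alpha^n - \beta^n}{\sqrt{5}};$$

that is,

$$F_n = \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^n - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^n.$$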


Proof 

All we need to do is solve for a and b in Theorem 12.2. In other words, 
we need to solve the following system of simultaneous equations: 

$$0 = a + b,$$

$$1 = a\alpha + b\beta.$$


So b = -a, and then from the second equation we get $a = \frac{1}{\alpha - \beta} = \frac{1}{\sqrt{5}}$;
hence, $b = -\frac{1}{\sqrt{5}}$. By Theorem 12.2, this completes the proof. ■


Since $|\beta| < 1$, Binet’s formula gives us another proof of the important
result in Theorem 12.1 that the ratios of consecutive terms in the
Fibonacci sequence approach the golden ratio:

$$\lim_{n \to \infty} \frac{F_{n+1}}{F_n} = \lim_{n \to \infty} \frac{\alpha^{n+1} - \beta^{n+1}}{\alpha^n - \beta^n} = \lim_{n \to \infty} \frac{\alpha^{n+1}}{\alpha^n} = \alpha.$$

Moreover, in practice, we can ignore the $\beta^n/\sqrt{5}$ term in Binet’s formula
since it approaches 0 as n gets large. In fact, as long as n is positive, $F_n$
is the nearest integer to $\alpha^n/\sqrt{5}$. For example, for n = 7, $\alpha^7/\sqrt{5} \approx 12.9846$, so
$F_7 = 13$; and for n = 8, $\alpha^8/\sqrt{5} \approx 21.0095$, so $F_8 = 21$. In this way we can
easily use a calculator to compute $F_{50}$ as the nearest integer to
$\alpha^{50}/\sqrt{5}$, namely 12 586 269 025.
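
The rounding shortcut is easy to try on a computer as well as a calculator. The following is only a sketch in Python; the names fib_recurrence and fib_binet are ours, not the book's, and the code assumes ordinary double-precision arithmetic, which is accurate enough for n up to 50 but will eventually fail for much larger n.

```python
from math import sqrt

ALPHA = (1 + sqrt(5)) / 2  # the golden ratio


def fib_recurrence(n):
    """Compute F_n term by term from F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_binet(n):
    """Compute F_n as the nearest integer to alpha^n / sqrt(5).

    Assumes floating-point error stays below 1/2, which holds for small n.
    """
    return round(ALPHA ** n / sqrt(5))


# The two methods agree on every index we try here.
agreement = all(fib_binet(n) == fib_recurrence(n) for n in range(51))
f50 = fib_binet(50)  # 12586269025
```

The recurrence version uses exact integer arithmetic, so it serves as the reference against which the rounded value is compared.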

An astounding number of identities have been discovered concern- 
ing Fibonacci numbers, and some of these identities can be proved 
using Binet’s formula. One such identity involves squares of Fibonacci 
numbers. If you square two numbers in the Fibonacci sequence that 
have just one other Fibonacci number between them, such as 5 and 
13, and take the difference, the result is another Fibonacci number; for 
example, $13^2 - 5^2 = 169 - 25 = 144$, that is, $F_7^2 - F_5^2 = F_{12}$; or, using 8
and 21, we get $21^2 - 8^2 = 441 - 64 = 377$, that is, $F_8^2 - F_6^2 = F_{14}$.
Therefore, the identity we wish to prove is

$$F_{n+2}^2 - F_n^2 = F_{2n+2}, \quad \text{for } n \ge 0.$$

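Using Binet's formula we write

$$\begin{aligned}
F_{n+2}^2 - F_n^2 &= \tfrac{1}{5}\left(\alpha^{n+2} - \beta^{n+2}\right)^2 - \tfrac{1}{5}\left(\alpha^n - \beta^n\right)^2 \\
&= \tfrac{1}{5}\left(\alpha^{2n+4} - 2(\alpha\beta)^{n+2} + \beta^{2n+4} - \alpha^{2n} + 2(\alpha\beta)^n - \beta^{2n}\right) \\
&= \tfrac{1}{5}\left(\alpha^{2n+4} - \alpha^{2n} - \beta^{2n} + \beta^{2n+4}\right),
\end{aligned}$$

where the last step holds because $\alpha\beta = -1$, and so the two terms
$-2(\alpha\beta)^{n+2}$ and $2(\alpha\beta)^n$ cancel each other out since n + 2 and n have the
same parity. We continue where we left off by inserting a term $(\alpha\beta)^2 = 1$
into the middle of the last expression:

$$\begin{aligned}
F_{n+2}^2 - F_n^2 &= \tfrac{1}{5}\left(\alpha^{2n+4} - \alpha^{2n} - \beta^{2n} + \beta^{2n+4}\right) \\
&= \tfrac{1}{5}\left(\alpha^{2n+4} - (\alpha\beta)^2\alpha^{2n} - (\alpha\beta)^2\beta^{2n} + \beta^{2n+4}\right) \\
&= \tfrac{1}{5}\left(\alpha^2 - \beta^2\right)\left(\alpha^{2n+2} - \beta^{2n+2}\right) \\
&= \tfrac{1}{\sqrt{5}}\left(\alpha^{2n+2} - \beta^{2n+2}\right) = F_{2n+2},
\end{aligned}$$

since $\alpha^2 - \beta^2 = (\alpha - \beta)(\alpha + \beta) = \sqrt{5} \cdot 1 = \sqrt{5}$. This verifies the identity.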


Tiling and the Fibonacci Numbers

The Fibonacci numbers can be thought of in many different ways. In 
this section we will represent them in a visual way by tiling a 1 x n board
using only 1 x 1 squares and 1 x 2 dominoes. The question is: For each
n, how many ways are there to tile a 1 x n board using only squares and
dominoes? Let’s try a few boards and look for a pattern. 

For n = 1, there is only one way to tile a 1 x 1 board: you have to use
one square. For n = 2, there are two ways to tile a 1 x 2 board: you can use 
two squares, or a single domino. For n = 3, there are three ways to tile 
a 1 x 3 board: you can use three squares; or you can use one square and 
one domino where you have two options for placing that domino. For 
n = 4, there are five ways to tile a 1 x 4 board: you can use four squares; 
or you can use one domino where you have three options for placing 
that domino; or you can use two dominos. In Problem 12.17 you will 
be asked to show in a similar fashion that there are eight ways to tile a 
1 x 5 board and thirteen ways to tile a 1 x 6 board using only squares 
and dominoes. 

There is an unmistakable pattern here. This is the Fibonacci se- 
quence, and the proof of the following theorem explains why this is so. 

Theorem 12.3. For $n \ge 1$, let $t_n$ represent the number of ways of tiling a 1 x n
board using only squares and dominoes. Then

$$t_n = F_{n+1}.$$

Proof 

Note that $t_1 = 1 = F_2$, and $t_2 = 2 = F_3$. Now all we have to do is verify
that the sequence $\{t_n\}$ satisfies the same recurrence relation that the
Fibonacci sequence satisfies. So suppose n > 2 and consider all possible
tilings of a 1 x n board. There are $t_n$ such tilings.

If we now visualize each of these t n tilings, there is a natural way to 
split these tilings into two disjoint groups. In one group, we put all the 
tilings that begin on the left with a square. In the second group, we put 
all the tilings that begin on the left with a domino. How many tilings 
are in each group? 

In the first group, with a single square at the beginning, the remaining
n - 1 positions can be covered in any way whatsoever with squares
and dominoes, so there are $t_{n-1}$ tilings in this group. Similarly, in the
second group, with a domino at the beginning, the remaining n - 2
positions are free to be covered in any way with squares and dominoes,
so there are $t_{n-2}$ tilings in this group. We conclude that

$$t_n = t_{n-1} + t_{n-2},$$

which is exactly the recurrence relation satisfied by the Fibonacci
sequence. This completes the proof. ■
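
The split into "begins with a square" and "begins with a domino" translates directly into a short recursive program. This is a sketch in Python, not anything from the book itself; the function name tilings and the letters S and D for squares and dominoes are our own choices, and we treat the empty board as having exactly one (empty) tiling so that the recursion has somewhere to stop.

```python
def tilings(n):
    """List every tiling of a 1 x n board as a string of 'S' and 'D'."""
    if n == 0:
        return [""]  # assumption: the empty board has one tiling
    # First group: tilings that begin with a square.
    result = ["S" + rest for rest in tilings(n - 1)]
    # Second group: tilings that begin with a domino.
    if n >= 2:
        result += ["D" + rest for rest in tilings(n - 2)]
    return result


counts = [len(tilings(n)) for n in range(1, 7)]  # [1, 2, 3, 5, 8, 13]
```

For n = 4 the list contains the five tilings SSSS, SSD, SDS, DSS, and DD described above.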

You may have noticed that tiling a 1 x n board with squares and 
dominoes is the same as forming an n-mora meter with light (L) and 
heavy (H) syllables. Therefore, Theorem 12.3 is equivalent to the result 
from 200 B.C. in Pingala’s Chandahsutra that the number of possible 
variations for a meter with n moras is the Fibonacci number $F_{n+1}$.

One of the main reasons for representing the Fibonacci numbers 
as tilings is that this provides us with a tangible, and very visual, 
way of thinking about them. We can demonstrate the usefulness of 
such a representation by looking at the following well-known identity 
involving the Fibonacci numbers: 

$$F_1 + F_2 + F_3 + \cdots + F_n = F_{n+2} - 1.$$

For example, for n = 6, we see that 1 + 1 + 2 + 3 + 5 + 8 = 20 = 21 - 1.

Often with identities involving the Fibonacci numbers it is quite 
straightforward to verify the identity by induction (see Problem 12.19), 
but as we have mentioned before, proofs by induction rarely offer any 
insight into why an identity or theorem is true, and that is precisely 
the case here. Instead, we will offer a proof using tilings which has the 
significant benefit of providing insight into why this particular identity 
is true. 

We will describe the proof for the case n = 6, but the idea is
completely general. First, using Theorem 12.3, we recast the identity
$F_1 + F_2 + F_3 + \cdots + F_6 = F_8 - 1$ in terms of tilings and it becomes
$1 + t_1 + t_2 + \cdots + t_5 = t_7 - 1$; that is,

$$t_1 + t_2 + \cdots + t_5 = t_7 - 2.$$

The right side of this identity represents all tilings of a 1 x 7 board 
except for two particular tilings (exactly which two, we don’t know yet). 
Our job is to figure out which tilings could be represented by each of the 
numbers $t_1$ through $t_5$. This is easier than you might think. How many
tilings have a domino in the first position on the left side of the board?
There are exactly $t_5$ because placing a domino first on the left leaves five
positions to be covered in any way with squares and dominoes. How
many tilings have a domino that covers the second and third cells from
the left in the 1 x 7 board? There are exactly $t_4$ because there are still
four positions to the right of this domino to be covered in any way with 
squares and dominoes, and the one position to its left must be covered 
with a square. 
Continuing we see that $t_3$ is the number of tilings in which the
first domino covers the third and fourth cells in the board (hence two
squares are used to its left, but the three positions to the right of this
domino can be covered in any way with squares and dominoes); $t_2$ is
the number of tilings in which the first domino covers the fourth and
fifth cells in the board; and $t_1$ is the number of tilings in which the first
domino covers the fifth and sixth cells in the board.

Finally, we can see the only two tilings missing on the left side of the 
identity: namely, the tiling in which the first domino covers the sixth 
and seventh cells on the board, and the tiling that has no first domino 
because it consists entirely of squares. In Problem 12.18 you are asked 
to “visualize” this Fibonacci identity by appropriately grouping the t 7 
tilings. 
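
The grouping by the position of the first domino can also be carried out mechanically. The sketch below, in Python, lists all tilings of a 1 x 7 board and sorts them by the leftmost cell covered by the first domino; the helper names tilings and first_domino_cell are our own, and None is used as a label for the one tiling that has no domino at all.

```python
from collections import Counter


def tilings(n):
    """List every tiling of a 1 x n board as a string of 'S' and 'D'."""
    if n == 0:
        return [""]
    result = ["S" + rest for rest in tilings(n - 1)]
    if n >= 2:
        result += ["D" + rest for rest in tilings(n - 2)]
    return result


def first_domino_cell(tiling):
    """Return the left cell (counting from 1) of the first domino, or None."""
    cell = 1
    for piece in tiling:
        if piece == "D":
            return cell
        cell += 1  # everything before the first domino is a square
    return None


groups = Counter(first_domino_cell(t) for t in tilings(7))
```

The group sizes for cells 1 through 5 are 8, 5, 3, 2, and 1, that is, $t_5, t_4, t_3, t_2, t_1$, and the two leftover tilings (first domino on cells 6 and 7, or no domino) account for the "- 2" in the identity.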

In Problem 5.42 we saw that the Fibonacci numbers “appeared” in 
Pascal’s triangle as sums of numbers along certain diagonal lines. In 
particular, the Fibonacci numbers appeared as 1 = 1, 1 = 1, 1 + 1 = 2,
1 + 2 = 3, 1 + 3 + 1 = 5, 1 + 4 + 3 = 8, 1 + 5 + 6 + 1 = 13,
1 + 6 + 10 + 4 = 21, and so on. We can now explain why this is the case.
In fact, it is nothing more than the following remarkable identity — 
an identity discovered by Edouard Lucas in 1876 — that relates binomial 
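coefficients and Fibonacci numbers for $n \ge 0$:

$$\binom{n}{0} + \binom{n-1}{1} + \binom{n-2}{2} + \binom{n-3}{3} + \cdots = F_{n+1},$$

where the sum on the left continues as long as the terms make sense.
So, for example, when n = 5 the sum is $\binom{5}{0} + \binom{4}{1} + \binom{3}{2} = 1 + 4 + 3$, and
when n = 6 the sum is $\binom{6}{0} + \binom{5}{1} + \binom{4}{2} + \binom{3}{3} = 1 + 5 + 6 + 1$.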

Let’s now use tilings to understand why this identity is true. By 
Theorem 12.3, the right side of this identity represents all tilings of a 
1 x n board. What does each term on the left represent?

We can look at a specific example to get the answer. Let n = 6 . We 
will focus on the number of dominoes in each tiling. There is just one 
tiling of a 1 x 6 board that uses 0 dominoes. There are five tilings that 
use 1 domino. How many use 2 dominoes? This is a little harder, but if 
the first domino covers the first and second cells, then there are three 
options for the second domino; if the first domino covers the second 
and third cells, then there are two options for the second domino; and 
if the first domino covers the third and fourth cells, then there is only 
one option for the second domino; this is a total of six tilings. Finally, 
there is just one tiling that uses 3 dominoes. 

This example suggests that each binomial coefficient $\binom{n-i}{i}$ on the left
in the identity should represent the number of tilings of a 1 x n board
that use exactly i dominoes. Let’s verify that this is indeed correct by
counting the number of ways to place exactly i dominoes on a 1 x n
board. Of course, it makes sense to try to do this only if $i \le \frac{n}{2}$.

Since each domino covers two cells, any tiling using i dominoes must
use exactly n - 2i squares. Altogether, then, we use i + (n - 2i) = n - i
pieces to tile this board. So a tiling of the board amounts to deciding
in what order along the board to arrange these n - i pieces, of which
exactly i are dominoes. Thus we are choosing among the n - i “positions”
in this order where to place the i dominoes; that is, we have precisely
$\binom{n-i}{i}$ options.

Now we can see why there were six ways of placing 2 dominos in 
the example we looked at where n = 6 . With 2 dominos, there will be 
two squares, and hence a total of four pieces. So, it simply amounts to 
deciding in what order to place these four pieces, and there are $\binom{4}{2} = 6$
choices: DDSS, DSDS, DSSD, SDDS, SDSD, SSDD. 
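
The counting argument can be mirrored in a few lines of Python. This is a sketch under our own naming (arrangements and lucas_sum are not from the book): the first function chooses which of the n - i positions hold dominoes, and the second adds up the binomial coefficients along the diagonal.

```python
from itertools import combinations
from math import comb


def arrangements(n, i):
    """List the tilings of a 1 x n board that use exactly i dominoes."""
    pieces = n - i  # i dominoes plus n - 2i squares
    out = []
    for domino_slots in combinations(range(pieces), i):
        out.append("".join("D" if k in domino_slots else "S"
                           for k in range(pieces)))
    return out


def lucas_sum(n):
    """Sum of C(n - i, i) for every i with 2i <= n."""
    return sum(comb(n - i, i) for i in range(n // 2 + 1))


six_ways = arrangements(6, 2)
# ['DDSS', 'DSDS', 'DSSD', 'SDDS', 'SDSD', 'SSDD']
diagonal_sums = [lucas_sum(n) for n in range(8)]
# [1, 1, 2, 3, 5, 8, 13, 21]
```

Because combinations produces the position sets in increasing order, the six tilings come out in the same order as in the list above.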

We now turn to a Fibonacci identity that is important enough to be 
labeled a theorem. 

Theorem 12.4. For all $m \ge 2$, $n \ge 1$,

$$F_{m+n} = F_{m-1}F_n + F_m F_{n+1}.$$

So, for example, for m = 4 and n = 5, we get $F_3 F_5 + F_4 F_6 =
2 \cdot 5 + 3 \cdot 8 = 34$ and indeed, $F_9 = 34$; similarly, for m = n = 5, we
get $F_4 F_5 + F_5 F_6 = 3 \cdot 5 + 5 \cdot 8 = 55 = F_{10}$.


Proof 

Since, by Theorem 12.3, $F_{m+n} = t_{m+n-1}$, we see that $F_{m+n}$ represents all
tilings of a 1 x (m + n - 1) board. Our goal is to separate these tilings into
two groups so that $F_m F_{n+1}$ represents one group of tilings and $F_{m-1} F_n$
represents the other group.

Let the first group be all of the tilings where a single domino does not
cover the two adjacent cells m - 1 and m. For these tilings we see that
the first m - 1 cells of the board are tiled with squares and dominoes,
and then, independently, the remaining n cells of the board are tiled
with squares and dominoes. Since there are $t_{m-1} = F_m$ ways to tile the
first part of the board, and $t_n = F_{n+1}$ ways to tile the second part of the
board, there is a total of $F_m F_{n+1}$ tilings in this group.

The second group of tilings will be those where a single domino
does cover the two cells m - 1 and m. For these tilings we see that
the first m - 2 cells of the board are tiled with squares and dominoes,
and then, independently, the n - 1 cells of the board following the
domino covering cells m - 1 and m are tiled with squares and dominoes.
Since there are $t_{m-2} = F_{m-1}$ ways to tile the first part of the board, and
$t_{n-1} = F_n$ ways to tile the second part of the board, there is a total of
$F_{m-1} F_n$ tilings in this group. This completes the proof. ■
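
To see the two groups concretely, we can sort the tilings of a 1 x (m + n - 1) board by whether a domino straddles cells m - 1 and m. The Python sketch below does this for m = 4 and n = 5; the helper names tilings and breakable_after are our own, and the choice of m and n simply repeats the first example given after the theorem.

```python
def tilings(n):
    """List every tiling of a 1 x n board as a string of 'S' and 'D'."""
    if n == 0:
        return [""]
    result = ["S" + rest for rest in tilings(n - 1)]
    if n >= 2:
        result += ["D" + rest for rest in tilings(n - 2)]
    return result


def breakable_after(tiling, cell):
    """True if some piece ends exactly at `cell`, so no domino straddles it."""
    covered = 0
    for piece in tiling:
        covered += 1 if piece == "S" else 2
        if covered >= cell:
            return covered == cell
    return False


m, n = 4, 5
board = tilings(m + n - 1)  # all tilings of a 1 x 8 board
first_group = [t for t in board if breakable_after(t, m - 1)]
second_group = [t for t in board if not breakable_after(t, m - 1)]
```

Here the first group has 24 tilings, which is $F_4 F_6 = 3 \cdot 8$, and the second has 10, which is $F_3 F_5 = 2 \cdot 5$; together they make up all 34 tilings of the board.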


If we consider the terms in a geometric sequence $1, r, r^2, r^3, r^4, \ldots$
it is obvious that the square of any term $r^k$ in this sequence is equal
to the product of the two terms on either side, since $(r^k)^2 = r^{2k}$
and $r^{k-1} \cdot r^{k+1} = r^{2k}$. In view of Theorem 12.1, which tells us that
the Fibonacci sequence behaves more and more like a geometric
sequence as n increases, it is not surprising that the Fibonacci sequence
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, . . . has a similar property: the square of
any number in the Fibonacci sequence is almost exactly equal to the
product of the two numbers on either side of it in the sequence. For
example, $8^2 = 64$ and this almost equals 5 · 13 = 65; or, for another
example, $34^2 = 1156$ almost equals 21 · 55 = 1155. In fact, as you might
guess from these two examples, the square and the product of the two
numbers always differ by 1. This property is usually stated this way:

$$F_n^2 = F_{n+1}F_{n-1} + (-1)^{n+1}.$$

Note that this means that when n is odd the square is one more than the
product (as for $F_9 = 34$), and when n is even the square is one less than
the product (as for $F_6 = 8$). We will prove this property using tilings
because this method of proof offers considerable insight into why the 
square and the product are so nearly equal. 

This proof uses a clever technique from a truly wonderful book, Proofs 
That Really Count: The Art of Combinatorial Proof by Art Benjamin and 
Jennifer Quinn. The technique is based on a common notion in tilings 
called fault lines. In our situation where we are assuming that all 1 x n 
boards being tiled are placed horizontally, a fault line is any vertical 
straight line that passes along the edge of a square or a domino in a tiling 
of the board; or, in the event that two tiled boards have been placed one 
above the other, a line that does so on both boards.

You can see that the two aligned 1 x 5 boards on the left in Figure 12.3
have two fault lines: namely, the two lines passing on either side of the 
next-to-last square in the upper board. Similarly, the two boards on the 
right also have two fault lines. Figure 12.3 illustrates the technique of 
tail swapping used by Benjamin and Quinn that we need for our proof. 
When two boards are stacked as on the left in Figure 12.3, everything 
past the last fault line is called the tail for each board. So, in this case, the 
tail for the upper board is a single square and the tail for the lower board 
is a single domino. Benjamin and Quinn then swap these two tails to
produce the two boards on the right: a tiling of a 1 x 6 board and a tiling
of a 1 x 4 board. This process is called tail swapping.

Figure 12.3 Tail swapping after the last fault line.

Two things should be clear about this process. Tail swapping has 
no effect on fault lines. In other words, any fault lines that exist for 
the original two boards are also the fault lines for the resulting two 
boards (in the same relative position). This means, in particular, that 
this process is completely reversible; for example, if we applied the tail 
swapping process to the two boards on the right in Figure 12.3, this 
would produce the same two boards we started with. We are now ready 
to verify this Fibonacci identity. 


Theorem 12.5. For all $n \ge 2$,

$$F_n^2 = \begin{cases} F_{n+1}F_{n-1} + 1, & \text{if } n \text{ is odd;} \\ F_{n+1}F_{n-1} - 1, & \text{if } n \text{ is even.} \end{cases}$$


Proof 

Our basic strategy in this proof will be to try to prove that the two
numbers $F_n^2$ and $F_{n+1}F_{n-1}$ are equal. We will almost, but not quite,
succeed.

What does the number $F_n^2$ represent? Since $F_n = t_{n-1}$ is the number of
ways to tile a 1 x (n - 1) board, then $F_n^2$ represents the number of ways to
tile two 1 x (n - 1) boards. So, for example, the number of ways we could
tile the two boards on the left in Figure 12.3 is $F_6^2$. Thus, since there are
8 ways to tile each of these boards, there is a total of 64 ways to tile both
of them.

Similarly, the number $F_{n+1}F_{n-1}$ also represents the number of ways to
tile two boards: a 1 x n board and a 1 x (n - 2) board. For example, the
number of ways we could tile the two boards on the right in Figure 12.3
is $F_7 \cdot F_5$. Thus, since there are 13 ways to tile the upper board and 5 ways
to tile the lower board, there is a total of 65 ways to tile both of them.

Therefore, in trying to prove that $F_n^2$ and $F_{n+1}F_{n-1}$ are equal, we will
attempt to set up a one-to-one correspondence between the set of all
tilings of two 1 x (n - 1) boards and the set of all tilings of two boards of
size 1 x n and 1 x (n - 2). This is where tail swapping comes in.

Tail swapping produces the one-to-one correspondence we are looking
for. Take any tiling of the two 1 x (n - 1) boards and arrange the
boards as on the left in Figure 12.3. Then apply the tail swapping
technique to these two boards. This produces exactly what we want, a
tiling of two boards of size 1 x n and 1 x (n - 2). Furthermore, since
this process is reversible, we know that it has produced a one-to-one
correspondence.

Now that we have a perfectly good one-to-one correspondence be- 
tween these sets, all that is left to do is make sure we didn’t miss any 
tilings in either set. For this last step we will consider two cases. 

If n is even, then n - 1 is odd, so any tiling of a 1 x (n - 1) board must
use at least one square. This means when we arrange two 1 x (n - 1)
boards as on the left in Figure 12.3, there must be at least one fault line.
Therefore, in this case, the tail swapping one-to-one correspondence
accounts for all of the tilings in the set with two 1 x (n - 1) boards.
However, we still have to ask in this case whether the one-to-one
correspondence produced all of the tilings of two boards of size 1 x n and
1 x (n - 2). Any tiling it might have missed would have to be a tiling that
has no fault lines, that is, a tiling consisting entirely of dominoes. And,
in this case, since n is even, there is exactly one tiling of two boards of
size 1 x n and 1 x (n - 2) consisting of nothing but dominoes! Thus we
conclude, for n even, that $F_n^2 = F_{n+1}F_{n-1} - 1$.

If n is odd, then the one-to-one correspondence does not miss any
tilings of two boards of size 1 x n and 1 x (n - 2) because these tilings
must all have at least one fault line. However, since n - 1 is even, there
is exactly one tiling of two 1 x (n - 1) boards that does not use any
squares, and uses only dominoes; hence the tail swapping one-to-one
correspondence never gets applied to this tiling. Thus we conclude, for
n odd, that $F_n^2 = F_{n+1}F_{n-1} + 1$. This completes the proof. ■
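
A direct numerical check does not replace the tail swapping argument, but it is reassuring to watch the difference between the square and the product flip between +1 and -1. The following Python sketch uses our own helper names fib and gap, and it checks only the finite range of n shown.

```python
def fib(n):
    """Compute F_n from F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def gap(n):
    """Return F_n^2 - F_{n+1} F_{n-1}."""
    return fib(n) ** 2 - fib(n + 1) * fib(n - 1)


gaps = [gap(n) for n in range(2, 12)]
# [-1, 1, -1, 1, -1, 1, -1, 1, -1, 1]
```

The even values of n give -1 (as for $F_6 = 8$, where 64 = 65 - 1) and the odd values give +1 (as for $F_9 = 34$, where 1156 = 1155 + 1).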


Fibonacci Numbers and Divisibility 

The Fibonacci numbers have some remarkable divisibility properties, 
and we will look at several of these properties in this section. We have 
already mentioned that consecutive Fibonacci numbers are relatively 
prime (and you are asked to prove this fact in Problem 12.25). Note also 
that $F_3 = 2$, $F_5 = 5$, $F_7 = 13$, $F_{11} = 89$, $F_{13} = 233$, $F_{17} = 1597$ are
all prime, so it is tempting to conjecture that $F_p$ is prime whenever p is
prime; unfortunately, $F_{19} = 4181 = 37 \cdot 113$, so this conjecture fails.

However, an interesting flip side to this failed conjecture is that for 
any prime p greater than 5 it is always the case that p divides either
$F_{p-1}$ or $F_{p+1}$, but not both (for example, $11 \mid F_{10} = 55$, but $11 \nmid F_{12} =
144$). In order to prove this property we will use Binet’s formula,
the binomial theorem, congruences, Fermat’s little theorem, and also
Theorem 12.5. 

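Let p be a prime such that p > 5. Using Binet’s formula and the
binomial theorem, we get

$$F_p = \frac{1}{\sqrt{5}}\left[\left(\frac{1+\sqrt{5}}{2}\right)^p - \left(\frac{1-\sqrt{5}}{2}\right)^p\right] = \frac{1}{2^{p-1}}\left[\binom{p}{1} + \binom{p}{3}5 + \binom{p}{5}5^2 + \cdots + \binom{p}{p}5^{(p-1)/2}\right].$$
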

Looking at this last expression modulo p, the first thing we notice
is that $2^{p-1} \equiv 1 \pmod{p}$ by Fermat’s little theorem. This means
that, modulo p, we can multiply through by $2^{p-1} \equiv 1$ to cancel
the $2^{p-1}$ term. Next, recall from Chapter 5 that p divides each of the
binomial coefficients $\binom{p}{1}, \binom{p}{3}, \binom{p}{5}, \ldots, \binom{p}{p-2}$ (see Problem 5.40). Thus,
since $\binom{p}{p} = 1$, we can conclude that

$$F_p \equiv 5^{(p-1)/2} \pmod{p}.$$

But then, $F_p^2 \equiv 5^{p-1} \equiv 1 \pmod{p}$, again by Fermat’s little theorem,
and now, since Theorem 12.5 gives us that $F_{p+1}F_{p-1} = F_p^2 - 1$, we can
see that $F_{p+1}F_{p-1} \equiv 0 \pmod{p}$; hence, as claimed, $p \mid F_{p+1}$ or $p \mid F_{p-1}$.
Finally, it is impossible for p to divide both of these numbers because if
it did, then it would also have to divide $F_p$ (by the recurrence relation
$F_{p-1} + F_p = F_{p+1}$), but then p would also divide $F_{p-2}$ (again by the
Fibonacci recurrence relation), and also divide $F_{p-3}$, and so on, until p
would divide $F_2 = 1$.
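
It is worth trying this property on a handful of primes. The Python sketch below is our own illustration (the names fib, is_prime, and neighbors_divisible are not from the book); for each prime p it reports which of the two indices p - 1 and p + 1 gives a Fibonacci number divisible by p, and it also compares $F_p$ with $5^{(p-1)/2}$ modulo p.

```python
def fib(n):
    """Compute F_n from F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def is_prime(p):
    """Trial division; adequate for the small primes used here."""
    return p > 1 and all(p % d for d in range(2, int(p ** 0.5) + 1))


def neighbors_divisible(p):
    """Return the indices k in (p - 1, p + 1) for which p divides F_k."""
    return [k for k in (p - 1, p + 1) if fib(k) % p == 0]


primes = [p for p in range(7, 100) if is_prime(p)]
report = {p: neighbors_divisible(p) for p in primes}
congruence_holds = all(fib(p) % p == pow(5, (p - 1) // 2, p) for p in primes)
```

For instance, report[11] is [10], while both report[17] and report[19] are [18].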

An especially interesting instance of this divisibility property of the 
Fibonacci numbers occurs for the twin primes 17 and 19 because, as it 
happens, $17 \mid F_{18}$ and $19 \mid F_{18}$ since $F_{18} = 2584 = 2^3 \cdot 17 \cdot 19$.

Let’s see what else we can discover about the Fibonacci numbers 
and divisibility. Well, since the very first two Fibonacci numbers are 
odd, we can immediately see which Fibonacci numbers will be even 
and which will be odd simply because of the familiar rules “odd plus
odd is even” and “odd plus even is odd”; so the pattern in the
Fibonacci sequence has to be odd, odd, even, odd, odd, even, . . . , and
it is. In the Fibonacci sequence, every third number is divisible by 2:
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, . . . .

Which Fibonacci numbers are divisible by 3? This time it is not nearly
so obvious why it happens, but there does seem to be a very clear
pattern: 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, . . . . It
certainly looks as if every fourth number is divisible by 3. This is indeed
the case as we can prove by writing the first few terms of the Fibonacci
sequence modulo 3: 1, 1, 2, 0, 2, 2, 1, 0, 1, 1, 2, 0, 2, 2, 1, 0, . . . .

Since this sequence of remainders modulo 3 has just repeated itself in a 
cycle of length 8 — 1, 1, 2, 0, 2, 2, 1, 0 — we can be sure that it will do so 
forever because of the Fibonacci recurrence relation. Hence every fourth 
Fibonacci number is divisible by 3 (and only those Fibonacci numbers 
are divisible by 3). 
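
The same experiment can be run for any modulus. The following Python sketch is our own; the function name fib_mod_cycle is hypothetical, and the function assumes m is at least 2. It writes out the remainders of $F_1, F_2, F_3, \ldots$ modulo m until the starting pair 1, 1 is about to reappear.

```python
def fib_mod_cycle(m):
    """Return one full cycle of F_1, F_2, F_3, ... modulo m (m >= 2)."""
    cycle = []
    a, b = 1, 1
    while True:
        cycle.append(a)
        a, b = b, (a + b) % m
        if (a, b) == (1, 1):  # the pair 1, 1 is back, so the cycle repeats
            return cycle


cycle_mod_2 = fib_mod_cycle(2)  # [1, 1, 0]
cycle_mod_3 = fib_mod_cycle(3)  # [1, 1, 2, 0, 2, 2, 1, 0]
zeros_mod_3 = [k + 1 for k, r in enumerate(cycle_mod_3) if r == 0]  # [4, 8]
```

Since each remainder depends only on the two before it, seeing the pair 1, 1 again guarantees that the whole block of remainders repeats forever.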

So far, the pattern we have seen is that every third Fibonacci number 
is divisible by the third Fibonacci number $F_3 = 2$, and every fourth
Fibonacci number is divisible by the fourth Fibonacci number $F_4 = 3$.
Perhaps every fifth Fibonacci number is divisible by the fifth Fibonacci
number $F_5 = 5$. Well, it looks like the pattern might still be holding:
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584,
4181, 6765, . . . . In Problem 12.27 you will be asked to verify that this

is indeed the case by showing that the sequence of remainders for the 
Fibonacci sequence modulo 5 repeats itself in an appropriate way. 

At this point we are fairly confident that any Fibonacci number $F_n$
will divide each of the Fibonacci numbers $F_{kn}$ for k = 1, 2, 3, . . . .
Moreover, we believe these will be the only Fibonacci numbers that $F_n$
will divide. In other words, we have the following conjecture:

$$F_n \mid F_m \text{ if and only if } n \mid m, \text{ for all } n > 2.$$

We will prove this conjecture shortly, but it turns out that this 
divisibility property of the Fibonacci numbers is closely connected to
another remarkable divisibility property, one involving the greatest
common divisor of two Fibonacci numbers. Note that for $F_8 = 21$ and
$F_{12} = 144$, we have $\gcd(F_8, F_{12}) = 3 = F_4$, and 4 = gcd(8, 12). In other
words, in this case, $\gcd(F_8, F_{12}) = F_{\gcd(8,12)}$. This is not a coincidence,
but is in fact true in general:

$$\gcd(F_m, F_n) = F_{\gcd(m,n)}.$$

In fact, these two divisibility properties are so closely linked that they 
are very reminiscent of the famous lithograph of M. C. Escher called 
Drawing Hands in which two hands, a right and a left, have seemingly 
almost finished drawing one another. In the same way, in what follows, 
we will seem to use each of these two divisibility properties to prove the 
other. 
To emphasize how closely linked these two properties are, we will first
state them both as theorems and then give a “simultaneous” proof of 
both theorems. 

Theorem 12.6. For n > 2,

$$F_n \mid F_m \text{ if and only if } n \mid m.$$

Theorem 12.7. For any positive integers m and n,

$$\gcd(F_m, F_n) = F_{\gcd(m,n)}.$$


Proof of both theorems 

This really is a proof. Here is its logical structure, which has three parts: 
first we prove “half” of Theorem 12.6, then we use that “half” to prove 
Theorem 12.7, and finally, we use Theorem 12.7 to prove the remaining 
“half” of Theorem 12.6. (Can you sense the two hands drawing one 
another here?) 


Part 1. First, we prove that for any $n \ge 1$, $F_n \mid F_{kn}$ for all $k \ge 1$. In
other words, we are proving that if $n \mid m$, then $F_n \mid F_m$ (this is “half” of
Theorem 12.6). For this, we use induction.

Since $F_n \mid F_n$, this statement is true for k = 1. Now, assume that this
statement is true for some integer $k \ge 1$; that is, assume $F_n \mid F_{kn}$. By
Theorem 12.4, we have

$$F_{(k+1)n} = F_{kn+n} = F_{kn-1}F_n + F_{kn}F_{n+1}.$$

So, since $F_n \mid F_n$ and $F_n \mid F_{kn}$, we conclude that $F_n \mid F_{(k+1)n}$, as desired. This
completes the proof of Part 1.


Part 2. Now we prove Theorem 12.7. One way to show that two num- 
bers are equal is to show that they divide one another. That is the 
approach we take here. 

We first show that $F_{\gcd(m,n)} \mid \gcd(F_m, F_n)$. Let d = gcd(m, n). By Part 1,
$F_d \mid F_m$ and $F_d \mid F_n$. Therefore, $F_{\gcd(m,n)} \mid \gcd(F_m, F_n)$.

Now we write d as a linear combination of m and n: d = xm + yn, and
then by Part 1 let $F_{xm} = aF_m$ and $F_{yn} = bF_n$. By Theorem 12.4, we have

$$F_d = F_{xm+yn} = F_{xm-1}F_{yn} + F_{xm}F_{yn+1} = (F_{xm-1}b)F_n + (F_{yn+1}a)F_m.$$

Therefore, $F_d$ is a linear combination of $F_n$ and $F_m$. Hence any common
divisor of $F_n$ and $F_m$ will divide $F_d$; and so, in particular,
$\gcd(F_m, F_n) \mid F_{\gcd(m,n)}$. This completes the proof of Part 2.

Part 3. Finally, we prove the other “half” of Theorem 12.6. Suppose that 
F n \F m (and that n > 2). We need to prove that n\m. 

Since $F_n \mid F_m$, it follows that $\gcd(F_n, F_m) = F_n$. But, by Theorem 12.7,
$\gcd(F_n, F_m) = F_{\gcd(n,m)}$. Therefore, $F_n = F_{\gcd(n,m)}$, and for n > 2 this
implies that n = gcd(n, m), and so we conclude that $n \mid m$. (Note that if
n = 2 this argument fails since $F_2 = F_{\gcd(2,15)}$, and yet $2 \ne \gcd(2, 15)$.)
This completes the proof of Part 3, and therefore the proof of both
theorems. ■
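
Both theorems are easy to test on a computer for small indices. The sketch below, in Python, uses our own helper fib together with the standard library function math.gcd; it checks only the finite ranges shown, so it illustrates the theorems rather than proving them.

```python
from math import gcd


def fib(n):
    """Compute F_n from F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# Theorem 12.7: gcd(F_m, F_n) = F_gcd(m, n).
gcd_property = all(
    gcd(fib(m), fib(n)) == fib(gcd(m, n))
    for m in range(1, 41)
    for n in range(1, 41)
)

# Theorem 12.6: for n > 2, F_n divides F_m exactly when n divides m.
divisibility_property = all(
    (fib(m) % fib(n) == 0) == (m % n == 0)
    for n in range(3, 13)
    for m in range(1, 61)
)

# Why n = 2 is excluded: F_2 = 1 divides F_15, yet 2 does not divide 15.
exception_at_two = fib(15) % fib(2) == 0 and 15 % 2 != 0
```

The example in the text corresponds to gcd(fib(8), fib(12)), which is 3, the same as fib(4).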


Generating Functions 

In this section we introduce a powerful tool for finding formulas — such 
as Binet’s formula — for sequences of numbers. This tool is called gen- 
erating functions. In his outstanding book about generating functions, 
generatingfunctionology, Herbert Wilf describes a generating function
metaphorically as “a clothesline on which we hang up a sequence of 
numbers for display.” 

Here is the “clothesline” (i.e., the generating function) for the Fi- 
bonacci numbers: 

$$0 + 1x + 1x^2 + 2x^3 + 3x^4 + 5x^5 + 8x^6 + 13x^7 + \cdots.$$

How can this be a useful thing to do? All we have really done is hang 
each Fibonacci number F n on the clothesline in its position as the 
coefficient of $x^n$. The reason this turns out to be useful is that this
clothesline is now an algebraic object that we can manipulate using the 
ordinary rules of algebra. 

So, a generating function is an infinite series, but try not to think of it 
as a function. We are not concerned with whether it converges. We will 
not evaluate it for particular values of x (in fact, there is no domain in 
sight). To emphasize that our point of view here is purely algebraic, we 
often refer to these objects as power series. 

Let’s begin by doing some algebra on the generating function (i.e., 
the power series) for the Fibonacci sequence. Let the power series for 
the Fibonacci sequence be given by 


$$f(x) = F_0 + F_1 x + F_2 x^2 + F_3 x^3 + F_4 x^4 + \cdots.$$



Then, we can use the recurrence relation $F_n = F_{n-1} + F_{n-2}$ and the fact
that $F_0 = 0$ and $F_1 = 1$ to write

$$f(x) - x = F_2 x^2 + F_3 x^3 + F_4 x^4 + \cdots$$
$$= (F_0 x^2 + F_1 x^3 + F_2 x^4 + \cdots) + (F_1 x^2 + F_2 x^3 + F_3 x^4 + \cdots)$$
$$= x^2 f(x) + x f(x).$$

Solving now for $f(x)$, we get $(1 - x - x^2) f(x) = x$; and so

$$f(x) = \frac{x}{1 - x - x^2}.$$

Therefore, $\frac{x}{1 - x - x^2}$ is the generating function for the Fibonacci sequence.
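To see the clothesline reappear from this closed form, one can divide the two polynomials as formal power series, one coefficient at a time. This is a minimal sketch, assuming the denominator has constant term 1; the function name `series_quotient` is illustrative and not from the text.

```python
def series_quotient(num, den, terms):
    """First `terms` coefficients of num(x) / den(x) as a formal power series.

    `num` and `den` are coefficient lists, lowest degree first.
    This sketch assumes den[0] == 1, which holds for 1 - x - x^2.
    """
    if den[0] != 1:
        raise ValueError("this sketch assumes a denominator with constant term 1")
    coeffs = []
    for n in range(terms):
        # Compare coefficients of x^n in  den(x) * q(x) = num(x).
        c = num[n] if n < len(num) else 0
        for k in range(1, min(n, len(den) - 1) + 1):
            c -= den[k] * coeffs[n - k]
        coeffs.append(c)
    return coeffs


# x / (1 - x - x^2): numerator [0, 1], denominator [1, -1, -1]
fib_coeffs = series_quotient([0, 1], [1, -1, -1], 10)
```

Here `fib_coeffs` is `[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]`. No value of $x$ is ever substituted, which matches the purely algebraic point of view taken above.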

At this point we are going to pause for a moment because we are 
preparing to use a method called partial fractions that you may have seen 
before (perhaps in a calculus course). This is a method, really just an
algebraic trick, to change the form of an expression.

Here is a simple example. It is easy to see that
$\frac{2}{x - 1} + \frac{1}{x + 3} = \frac{3x + 5}{x^2 + 2x - 3}$
simply by finding a common denominator on the left and so the
numerator becomes $2(x + 3) + 1(x - 1) = 3x + 5$.

However, if we begin with the expression $\frac{3x + 5}{x^2 + 2x - 3}$, we could also
produce the two fractions on the left (the “partial fractions”) by first
factoring the denominator as $x^2 + 2x - 3 = (x - 1)(x + 3)$ and assuming
that $\frac{3x + 5}{x^2 + 2x - 3} = \frac{a}{x - 1} + \frac{b}{x + 3}$ for some numbers $a$ and $b$, and getting a
common denominator on the right which then yields as a numerator
$a(x + 3) + b(x - 1)$. Since the two numerators must be equal, this produces
two simultaneous equations ($a + b = 3$ and $3a - b = 5$), which we can
solve to find that $a = 2$ and $b = 1$.
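The two simultaneous equations can also be solved mechanically. The sketch below uses Cramer's rule with exact rational arithmetic and then checks the partial-fraction identity at a few sample points; the helper names `solve_2x2` and `identity_holds_at` are illustrative, not from the text.

```python
from fractions import Fraction


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 by Cramer's rule."""
    det = Fraction(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("the system has no unique solution")
    x = Fraction(b1 * a22 - a12 * b2) / det
    y = Fraction(a11 * b2 - b1 * a21) / det
    return x, y


# a + b = 3  and  3a - b = 5
a, b = solve_2x2(1, 1, 3, 3, -1, 5)


def identity_holds_at(x):
    """Check (3x + 5)/(x^2 + 2x - 3) == a/(x - 1) + b/(x + 3) at one point."""
    x = Fraction(x)
    left = (3 * x + 5) / (x * x + 2 * x - 3)
    right = a / (x - 1) + b / (x + 3)
    return left == right
```

The sample points must avoid $x = 1$ and $x = -3$, where the denominators vanish.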

Why is this useful? Well, suppose you want to evaluate the integral
$\int \frac{3x + 5}{x^2 + 2x - 3}\,dx$. Since it is easy to integrate functions such as $\frac{2}{x - 1}$ and $\frac{1}{x + 3}$,
you use the method of partial fractions to change the form of $\frac{3x + 5}{x^2 + 2x - 3}$ to
get

$$\int \frac{3x + 5}{x^2 + 2x - 3}\,dx = \int \frac{2}{x - 1}\,dx + \int \frac{1}{x + 3}\,dx = 2\ln(x - 1) + \ln(x + 3) + C.$$

In the example above, having a partial fraction with a linear term 
in the denominator is useful because the integral is then a logarithm. 
When working with generating functions, having a partial fraction 
with a linear term in the denominator is also useful, but for a very 
different reason. In this case it is useful because such a fraction can be 
expressed as a geometric series. For instance, we can immediately write 
the fraction $\frac{1}{1 - 2x}$ as

$$\frac{1}{1 - 2x} = 1 + 2x + (2x)^2 + (2x)^3 + (2x)^4 + \cdots.$$

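Now, back to $\frac{x}{1 - x - x^2}$, the generating function for the Fibonacci
sequence. Recall that $\alpha + \beta = 1$ and $\alpha\beta = -1$, where $\alpha = \frac{1 + \sqrt{5}}{2}$ and $\beta = \frac{1 - \sqrt{5}}{2}$;
thus we can factor the denominator as $1 - x - x^2 = (1 - \alpha x)(1 - \beta x)$.
Then, the method of partial fractions allows us to rewrite the generating
function for the Fibonacci numbers as

$$f(x) = \frac{x}{1 - x - x^2} = \frac{x}{(1 - \alpha x)(1 - \beta x)} = \frac{x}{\alpha - \beta}\left(\frac{\alpha}{1 - \alpha x} - \frac{\beta}{1 - \beta x}\right).$$

(You can check this by finding a common denominator for $\frac{\alpha}{1 - \alpha x} - \frac{\beta}{1 - \beta x}$.)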

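But $\alpha - \beta = \sqrt{5}$, and the two terms in the final parentheses can each be
expanded as geometric series, so we get

$$f(x) = \frac{x}{\sqrt{5}}\left(\alpha(1 + \alpha x + \alpha^2 x^2 + \cdots) - \beta(1 + \beta x + \beta^2 x^2 + \cdots)\right)$$
$$= \frac{\alpha - \beta}{\sqrt{5}}\,x + \frac{\alpha^2 - \beta^2}{\sqrt{5}}\,x^2 + \frac{\alpha^3 - \beta^3}{\sqrt{5}}\,x^3 + \cdots.$$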


Therefore, since these are just the Fibonacci numbers hanging on the 
clothesline, we conclude that the nth Fibonacci number is given by 

$$F_n = \frac{\alpha^n - \beta^n}{\sqrt{5}},$$

where $\alpha = \frac{1 + \sqrt{5}}{2}$ and $\beta = \frac{1 - \sqrt{5}}{2}$.

This of course is Binet’s formula. A major advantage of deriving 
this formula using generating functions is that the formula appears 
as part of the process. In Problem 12.15 you will be asked to prove 
Binet’s formula using induction. This is quite a simple proof, but the 
disadvantage is that it requires that you somehow know the formula in 
order to prove it. 
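Binet's formula can be checked against the recurrence without any rounding error by computing with numbers of the form $p + q\sqrt{5}$, where $p$ and $q$ are rational. The sketch below rests on one observation that is not spelled out in the text: if $\alpha^n = p + q\sqrt{5}$, then $\beta^n = p - q\sqrt{5}$, so $(\alpha^n - \beta^n)/\sqrt{5} = 2q$. The function names are illustrative.

```python
from fractions import Fraction


def binet(n):
    """Evaluate (alpha^n - beta^n) / sqrt(5) exactly, for n >= 0.

    alpha^n is stored as p + q*sqrt(5) with rational p and q; beta^n is then
    p - q*sqrt(5), so the quotient in Binet's formula equals 2*q.
    """
    half = Fraction(1, 2)
    p, q = Fraction(1), Fraction(0)  # alpha^0 = 1
    for _ in range(n):
        # Multiply (p + q*sqrt5) by alpha = 1/2 + (1/2)*sqrt5.
        p, q = half * p + 5 * half * q, half * p + half * q
    value = 2 * q
    if value.denominator != 1:
        raise ArithmeticError("expected an integer value")
    return value.numerator


def fib_recursive_definition(n):
    """F_n from the recurrence F_n = F_{n-1} + F_{n-2}, F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

A floating-point version of the formula would also work for moderate $n$, but exact arithmetic avoids having to argue about rounding.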


Problems 

12.1 * (a) Solve Fibonacci’s famous “rabbit problem.” Make sure you 

understand the basic premise of the problem, which is that a pair 
of baby rabbits (assumed to be a male and a female) take a month 
to become sexually mature, after which time they begin to
“breed like rabbits.” Also, since the problem does not explicitly 
state whether the original pair of rabbits is mature or not, there 
are really two slightly different versions of this problem to think 
about. 

(b) Explain why Fibonacci’s model for rabbit reproduction produces 
the Fibonacci sequence by describing clearly how to interpret the 
recurrence relation $F_n = F_{n-1} + F_{n-2}$ in this model.

12.2 (S) (a) Use the list of all possible three-meter moras and the list of all 

possible four-meter moras to construct a list of the eight possible 
five-meter moras. 

(b) Explain in general why the number of possible $n$-meter moras is
the sum of the possible number of $(n - 2)$-meter moras and the
possible number of $(n - 1)$-meter moras.

12.3 (S) Even though we normally think of the Fibonacci sequence as 

1, 1, 2, 3, 5, 8, 13, 21, ... , which starts with the two numbers $F_1 = 1$
and $F_2 = 1$, we saw in Theorem 12.2 that it is sometimes convenient
to put a zero at the beginning of this sequence by defining $F_0 = 0$ in
order to satisfy the recurrence relation $F_2 = F_1 + F_0$. Similarly,
Fibonacci numbers $F_n$ can be defined for negative values of $n$.

(a) Find $F_{-1}, F_{-2}, F_{-3}, \ldots, F_{-8}$.

(b) Write down a formula for $F_{-n}$.

12.4 (H,S) In early Egyptian mathematics the only kind of fractions that were
ever used are what we call unit fractions, that is, fractions of the form $\frac{1}{n}$
where $n$ is a positive integer (the only exception to this in Egyptian
mathematics is the single fraction $\frac{2}{3}$, which had a special symbol).
Needless to say this created certain challenges from a computational
point of view. For example, Problem 24 on the Rhind Papyrus shows
how to “compute” $\frac{19}{8}$, that is, how to divide 19 by 8, and the result is
the sum of 2, $\frac{1}{4}$, and $\frac{1}{8}$.

The Rhind Papyrus, now in the British Museum, dates from around
1650 B.C. and is nearly eighteen feet long and about a foot wide.
Almost a third of one side of this famous artifact is devoted to a table
of the doubles of all the unit fractions with odd denominators from $\frac{1}{3}$
to $\frac{1}{101}$ represented as a sum of unit fractions (as well as a computation
for each fraction demonstrating that if you multiply the sum by the
denominator you in fact get 2). For example, the double of $\frac{1}{17}$ is given
as the sum of $\frac{1}{12}$, $\frac{1}{51}$, and $\frac{1}{68}$. But where did this decomposition come
from?
Here is how an Egyptian scribe did this. Starting with 17 as a whole
quantity (that is, as 1 part), the $\frac{2}{3}$ part is $11 + \frac{1}{3}$ (because two-thirds of
18 is 12), so the $\frac{1}{3}$ part is $5 + \frac{2}{3}$. Then the $\frac{1}{6}$ part is $2 + \frac{1}{2} + \frac{1}{3}$, and the $\frac{1}{12}$
part is $1 + \frac{1}{4} + \frac{1}{6}$. At this point the scribe can see that the difference
between 2 (the double he is seeking) and this sum, that is,
$2 - (1 + \frac{1}{4} + \frac{1}{6})$, is $\frac{1}{3} + \frac{1}{4}$. But since $4 \times 17 = 68$, $\frac{1}{4}$ is a $\frac{1}{68}$ part
of 17, and since $3 \times 17 = 51$, $\frac{1}{3}$ is a $\frac{1}{51}$ part of 17. This yields the desired
decomposition $\frac{2}{17} = \frac{1}{12} + \frac{1}{51} + \frac{1}{68}$.

However, there is no general pattern that was followed for
producing the particular decompositions found on the Rhind
Papyrus. In 1202 Fibonacci discovered an algorithm for decomposing
rational numbers into sums of unit fractions. Here is how his
algorithm works on the fraction $\frac{2}{17}$. First, find the smallest integer $n$
such that $\frac{1}{n} \le \frac{2}{17}$. This integer is just $\lceil \frac{17}{2} \rceil = 9$ (since $\frac{1}{9} < \frac{2}{17} < \frac{1}{8}$). So $\frac{1}{9}$
is the first unit fraction in the sum. Now, you just repeat this same
process on the remainder $\frac{2}{17} - \frac{1}{9}$. But in this case, $\frac{2}{17} - \frac{1}{9} = \frac{1}{153}$ is
already a unit fraction so we are done and $\frac{2}{17} = \frac{1}{9} + \frac{1}{153}$. (Note that the
Egyptians preferred their decomposition to this one even though it
has more terms; presumably this is because the denominators in their
decomposition are smaller, and also because the first denominator in
their decomposition is even and they seemed to like even
denominators.)
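The procedure just described translates almost line for line into code. The following is a minimal sketch, assuming the input is a rational number strictly between 0 and 1; the function name `greedy_unit_fractions` is illustrative and not from the text.

```python
from fractions import Fraction


def greedy_unit_fractions(value):
    """Return the denominators produced by Fibonacci's algorithm.

    `value` must be a Fraction with 0 < value < 1. At each step the smallest
    n with 1/n <= value is chosen, which is the ceiling of 1/value.
    """
    if not 0 < value < 1:
        raise ValueError("expected a rational number strictly between 0 and 1")
    denominators = []
    while value > 0:
        # Ceiling of denominator/numerator, using integer arithmetic only.
        n = -(-value.denominator // value.numerator)
        denominators.append(n)
        value -= Fraction(1, n)
    return denominators


example = greedy_unit_fractions(Fraction(2, 17))
```

For the worked example above, `example` is `[9, 153]`. The loop ends because the numerator of the remainder shrinks at every step, which is the fact that part (a) below asks you to prove.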

(a) Let $\frac{a}{b}$ be any rational number between 0 and 1, and let $n$ be the
positive integer for which

$$\frac{1}{n} \le \frac{a}{b} < \frac{1}{n - 1}.$$

Prove that $\frac{a}{b} - \frac{1}{n}$ is a nonnegative rational number less than 1
whose numerator is less than $a$. Then explain why this shows
that Fibonacci’s algorithm will eventually produce a
decomposition for $\frac{a}{b}$ as a sum of distinct unit fractions.

(b) Use Fibonacci’s algorithm to find unit fraction decompositions 
for | and 

(c) Although there is no general pattern for all the decompositions 
on the Rhind Papyrus, there is an apparent pattern for those 
decompositions where the denominator n is divisible by 3. For 
example, the decomposition for | is | + anc } the 
decomposition for is A, + ^. Find and verify a simple 
algebraic identity that produces such decompositions 
whenever n is divisible by 3. 

(d) Several of the decompositions on the Rhind Papyrus use four 
unit fractions, though none use more than four. All of these can 
be done using fewer than four, however, except for the very last
one: $\frac{2}{101} = \frac{1}{101} + \frac{1}{202} + \frac{1}{303} + \frac{1}{606}$. Find and verify a simple
algebraic identity that produces for any positive integer $n$ a
decomposition of $\frac{2}{n}$ as a sum of four distinct unit fractions.

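12.5 * Using a calculator, we find that $\alpha \approx 1.618034$, and so
$1 - \alpha \cdot 1 \approx -.618034$. In a similar fashion compute each of the terms
in the linked sequence $1 - \alpha \cdot 1$, $2 - \alpha \cdot 1$, $3 - \alpha \cdot 2$, $5 - \alpha \cdot 3$, $8 - \alpha \cdot 5$,
$13 - \alpha \cdot 8$.

Then compare this to the geometric sequence $1 - \alpha$, $(1 - \alpha)\beta$,
$(1 - \alpha)\beta^2$, $(1 - \alpha)\beta^3$, $(1 - \alpha)\beta^4$, $(1 - \alpha)\beta^5$.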

12.6 The recurrence relation $F_n = F_{n-1} + F_{n-2}$ for the Fibonacci sequence is
called a second-order recurrence relation because it generates each
succeeding term in the Fibonacci sequence using two previous terms.
Similarly the recurrence relation $F_n = \alpha F_{n-1} + \beta^{n-1}$ is a first-order
recurrence relation because it needs only one previous term to
generate each succeeding term in the Fibonacci sequence.

In general, first-order recurrence relations are easier to work with: 
for example, we saw in Theorem 12.1 how useful this first-order 
recurrence relation could be to help us prove an important property 
of the Fibonacci sequence. This first-order relation is not, however, 
very useful for generating the Fibonacci sequence itself, as we will see 
in this problem. We will use it in two different ways to generate the 
Fibonacci sequence and discover that neither method is very 
satisfactory. 

If we begin with $F_1 = 1$, then we can use $F_n = \alpha F_{n-1} + \beta^{n-1}$ to
compute $F_2 = \alpha \cdot 1 + \beta = \alpha + \beta = 1$. In order to continue it will be
useful to recall that $\alpha$ and $\beta$ are the two solutions of the equation
$x^2 - x - 1 = 0$, and therefore $\beta^2 = \beta + 1$. Thus we get
$F_3 = \alpha \cdot 1 + \beta^2 = \alpha + \beta + 1 = 1 + 1 = 2$.

(a) By continuing in this fashion (and always reducing powers of $\beta$
using $\beta^2 = \beta + 1$), use the first-order recurrence relation
$F_n = \alpha F_{n-1} + \beta^{n-1}$ to compute $F_4$ and $F_5$.

(b) By actually using the values $\alpha = \frac{1 + \sqrt{5}}{2}$ and $\beta = \frac{1 - \sqrt{5}}{2}$, use the
first-order recurrence relation $F_n = \beta F_{n-1} + \alpha^{n-1}$ to compute $F_3$
and $F_4$ by hand. Then use a calculator to compute $F_7$ given that
$F_6 = 8$.

12.7 * (a) Confirm Theorem 12.1 by computing (say) the first ten ratios 

of consecutive terms of the Fibonacci sequence and 
verifying that these ratios do seem to be converging to a 
number slightly less than 1.62, and that for odd values of $n$
these ratios are increasing while for even values of n these ratios 
are decreasing. 

(b) The sequence of ratios of consecutive terms of the Fibonacci 
sequence in fact converges rather quickly. Determine how far 
out in this sequence you have to go before you know that the 
number to which these ratios converge can be correctly 
approximated by 1.618034. 

12.8 (H) To construct a line segment of length equal to the golden ratio
$\alpha = \frac{1 + \sqrt{5}}{2}$, begin with a line segment AB of length 1. At B draw a line
perpendicular to AB and let C be the point on this line directly above
B such that BC also has length 1. Let M be the midpoint of AB. Using
M as the center and MC as the radius, draw a circle and let D be the
point where this circle intersects the line AB extended beyond B.
Prove that the line segment AD has length $\alpha = \frac{1 + \sqrt{5}}{2}$.

12.9 (H) Let ABCDE be a regular pentagon such that each side has length 1.
Prove that the diagonal line segment EB has length equal to the
golden ratio $\alpha = \frac{1 + \sqrt{5}}{2}$.

12.10 Using a calculator to compute the golden ratio $\alpha = \frac{1 + \sqrt{5}}{2}$ we get
$\alpha = 1.618\ 033\ 988 \ldots$. If we square $\alpha$ and also invert $\alpha$, then we have
the following three numbers:

$\alpha = 1.618\ 033\ 988 \ldots$,

$\alpha^2 = 2.618\ 033\ 988 \ldots$,

$\frac{1}{\alpha} = 0.618\ 033\ 988 \ldots$.

Can you explain the surprising fact that, beginning with the golden 
ratio, the decimal part is exactly the same in all three of these 
numbers? 

12.11 In this problem we explore one of the reasons why people are 
attracted to the idea that the golden ratio was an intentional part of 
the design of several of the pyramids in Egypt. There is a right 
triangle, called the Kepler triangle (recall that astronomer Johannes
Kepler referred to the golden ratio as the “precious jewel”), such that
the three sides of the right triangle are in a geometric progression:
$1, r, r^2$ (where $r > 1$). Thus, by the Pythagorean theorem, $r^4 = r^2 + 1$,
and if we let $x = r^2$, this becomes our familiar quadratic equation
$x^2 - x - 1 = 0$, and the positive solution is $\alpha = \frac{1 + \sqrt{5}}{2}$. So the three sides
of the Kepler triangle are 1, $\sqrt{\alpha}$, and $\alpha$.

Now, the “Kepler triangle” that has been discovered at the Great 
Pyramid of Giza has the following dimensions. One leg of this 
triangle is the height of the pyramid, which is 481.2 feet. The other 
leg of this triangle is the perpendicular distance from the center of the 
base of the pyramid to one of the sides of the base, which is 377.89 
feet. The hypotenuse of this triangle is the altitude of a face of the 
pyramid, and its length is 611.85 feet. 

Compare the actual ratios of the three sides of this triangle with the 
ratios of the three sides of a Kepler triangle, then comment briefly on 
whether you think this is merely a coincidence or evidence that the 
builders of the Great Pyramid of Giza were intentionally using the 
golden ratio in the construction of this pyramid. 

12.12 Go to the website www.fibonacci.name/gallery.html. This picture 
gallery has many wonderful examples of various plants, foods, and 
animals that may or may not exhibit Fibonacci numbers or the 
golden ratio such as tulips, aloes, palm trees, dandelion blossoms, 
asparagus spears, broccoli, and rams’ horns. Find several examples in 
each direction, that is, examples that illustrate the existence of 
Fibonacci numbers in nature, as well as examples that do not. 

12.13 In our discussion of plant growth we only presented an argument that 
the golden ratio provided us with an excellent number upon which to 
base a divergence angle, not that it was in any sense the best number 
for this purpose. It turns out that the reason the golden ratio gives the 
best possible distribution of florets in a flower is because its continued 
fraction converges as slowly as possible. This is discussed in some 
detail in M. Naylor, “Golden, $\sqrt{2}$, and $\pi$ Flowers: A Spiral Story,”
Mathematics Magazine 75(3) (June 2002), 163-72, which you can 
access at http://www.michaelnaylor.com/resources/naylor-seeds.pdf. 

Draw a picture that simulates the growth of a flower if the 
divergence angle is 45° (recall that as the florets move radially 
outward they slow down). Do the same thing if the divergence angle 
is 54°. Explain why these angles do not produce a good distribution of 
florets. 

Then refer to Naylor’s paper to see what happens when the 
divergence angle corresponds to irrational numbers such as $\sqrt{2}$ or $\pi$.
Which of these two numbers seems to produce a distribution as good
as the one produced by the golden ratio? 

12.14 We used Binet’s formula to find $F_{50} = 12\ 586\ 269\ 025$. In order to
fully appreciate the usefulness of Binet’s formula, verify that this is 
correct by finding the first fifty Fibonacci numbers recursively. 

12.15 * (H) Use induction to prove Binet’s formula. 

12.16 (S) Use Binet’s formula to prove Theorem 12.5. You should split this 

into two cases depending on whether n is odd or even (this allows you 
to control when a power of $\alpha\beta = -1$ is $+1$ or $-1$).

12.17 * By listing all possibilities, determine how many ways there are to tile a 

1 x 5 board and how many ways there are to tile a 1 x 6 board using 
only squares and dominoes. 

12.18 Illustrate our proof of the Fibonacci identity

$$F_1 + F_2 + F_3 + \cdots + F_n = F_{n+2} - 1$$

by drawing all twenty-one tilings of a 1 × 7 board and grouping all
except two of these tilings into groups of sizes $f_1$ through $f_5$.

For the two tilings that you excluded from your groupings, which
one most naturally corresponds to $F_1 = 1$ in the Fibonacci identity?
Which most naturally corresponds to the $-1$ term in this identity?

Some authors define $f_0 = 1$ to correspond with $F_1$; this can be
thought of as the number of ways of tiling a 1 × 0 board (the only way
to tile this board is to use the empty tiling: select exactly 0 squares and
0 dominoes).

12.19 (H,S) We used our proof of the identity 

$$F_1 + F_2 + F_3 + \cdots + F_n = F_{n+2} - 1$$


as an introduction to the notion that tilings are a useful way to 
represent Fibonacci numbers; but there are three other simple ways of 
proving this same identity. 


(a) Use induction to prove this identity. 

(b) Use Binet's formula to prove this identity. 
(c) Show how this identity follows directly from the recurrence
relation $F_n = F_{n-1} + F_{n-2}$ by writing

$$F_1 = F_3 - F_2$$
$$F_2 = F_4 - F_3$$
$$\vdots$$
$$F_n = F_{n+2} - F_{n+1}.$$

12.20 ★ (S) Use induction to prove that for all n, 

$$F_1^2 + F_2^2 + F_3^2 + \cdots + F_n^2 = F_n F_{n+1}.$$

Then give a visual, geometric proof of this identity by representing
the numbers on the left side by actual squares and the number on the
right side as a rectangle of dimension $F_n$ by $F_{n+1}$.

12.21 (H) Prove the following identity in three different ways: 


$$F_1 + F_3 + F_5 + \cdots + F_{2n+1} = F_{2n+2}.$$


(a) By induction. 

(b) By using an algebraic trick as in part (c) in Problem 12.19. 

(c) By counting the number of tilings of a 1 × (2n + 1) board.

12.22 Prove the following identity in three different ways: 

$$F_2 + F_4 + F_6 + \cdots + F_{2n} = F_{2n+1} - 1.$$

(a) By induction. 

(b) By using an algebraic trick as in part (c) in Problem 12.19. 

(c) By using the identity in Problem 12.21 together with the 
identity in Problem 12.19. 

12.23 (S) Prove the following identity that relates binomial coefficients and
Fibonacci numbers:

$$\sum_{i=1}^{n} \binom{n}{i} F_i = F_{2n}$$

by counting the number of tilings of a 1 × (2n − 1) board, and by
considering, for each $i$ where $1 \le i \le n$, those tilings that have $i$
squares among the first $n$ pieces.
Figure 12.4


12.24 There is a long history of geometric dissection problems in 
mathematics. Figure 12.4 is a famous example that was discovered 
around 1830 by Henry Perigal, a London stock broker and amateur 
astronomer. It provides a proof of the Pythagorean theorem since it 
shows literally that the square on the hypotenuse of a right triangle is 
the sum of the squares on the two legs. 

Another very famous geometric dissection problem is based upon 
the identity in Theorem 12.5, although in this case we present a 
paradox rather than a proof. We will present this paradox for the 
consecutive Fibonacci numbers 5, 8, and 13 (but any three 
consecutive Fibonacci numbers will work; in fact, the bigger the 
numbers, the better the paradox). If we dissect an 8 x 8 square as 
shown in Figure 12.5, then we can reassemble the pieces to form the 
5 x 13 rectangle seen in Figure 12.6. Where does the extra square unit 
come from? 

12.25 * Give a proof by contradiction to show that any two consecutive 

Fibonacci numbers $F_{n-1}$ and $F_n$ are relatively prime.

12.26 We saw that $p = 19$ is the first prime for which $F_p$ is composite. For
each prime $p$ such that $20 < p < 50$ determine whether $F_p$ is prime.

Figure 12.6

12.27 * (H) Prove that it is every fifth number, and no others, in the Fibonacci
sequence that is divisible by the fifth Fibonacci number $F_5 = 5$ by
writing the Fibonacci sequence 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ... , as
a sequence of remainders modulo 5.

12.28 (H,S) Use Theorem 12.6 to prove that for any positive integer n, there are 

n consecutive Fibonacci numbers all of which are composite. 

12.29 (S) One of the most useful ways for representing positive integers is in 

terms of their prime decomposition, and all the more so because this 
representation is unique. In this problem we will see that positive 
integers can also be represented in terms of Fibonacci numbers. 

We can write each of the numbers 20 and 30 as a sum of Fibonacci
numbers as follows: 20 = 13 + 5 + 2 and 30 = 21 + 8 + 1. As in these
two examples, we prefer that sums consist of distinct Fibonacci
numbers so as to avoid uninteresting sums such as
20 = 1 + 1 + 1 + ⋯ + 1. Similarly, we will require that sums consist of
nonconsecutive Fibonacci numbers in order to avoid trivial reductions
such as changing 30 = 21 + 8 + 1 to 30 = 13 + 8 + 5 + 3 + 1.

(a) Write each of the numbers 50 and 225 as a sum of distinct 
nonconsecutive Fibonacci numbers. 

(b) Prove that every positive integer can be written uniquely as a 
sum of distinct nonconsecutive Fibonacci numbers. 
One of the most remarkable things about the Fibonacci numbers is
the number of different contexts in which they appear. The next 
several problems illustrate this point. 

12.30 * (H) Show that the number of subsets of the set $\{1, 2, 3, \ldots, n\}$ that
contain no pair of consecutive numbers is a Fibonacci number.

For example, for the set $\{1, 2, 3, 4\}$, none of the following eight
subsets contains a consecutive pair of numbers

$$\emptyset,\ \{1\},\ \{2\},\ \{3\},\ \{4\},\ \{1, 3\},\ \{2, 4\},\ \{1, 4\},$$

and 8 is a Fibonacci number.

12.31 (H,S) Here is a well-known variation of Problem 12.30. Suppose you 

have n chairs in a row. How many ways can you seat men and women 
in these chairs with no two women sitting next to each other? Verify 
your answer. 

12.32 ★ (H) Since the continued fraction

$$1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}}$$

converges to the golden ratio $\alpha$, it wouldn’t be too surprising to find
the Fibonacci numbers lurking around somewhere. Recall from
Chapter 4 that if

$$q_1 + \cfrac{1}{q_2 + \cfrac{1}{\ddots + \cfrac{1}{q_{n-1} + \cfrac{1}{q_n + \cdots}}}}$$

is a continued fraction with partial quotients $q_1, q_2, \ldots, q_n, \ldots$,
then the terms

$$q_1, \quad q_1 + \cfrac{1}{q_2}, \quad q_1 + \cfrac{1}{q_2 + \cfrac{1}{q_3}}, \quad q_1 + \cfrac{1}{q_2 + \cfrac{1}{q_3 + \cfrac{1}{q_4}}}, \quad \ldots$$

are called the convergents for the continued fraction.

Find the first four convergents for the continued fraction above,
and then prove that if $c_1, c_2, c_3, \ldots$ are the convergents for this
continued fraction then $c_n = \frac{F_{n+1}}{F_n}$ for all $n$.

Note that then, by Theorem 12.1, we know that the sequence of
convergents indeed converges to the golden ratio $\alpha$.

12.33 (S) Let M be the matrix ^ j ^ . Compute M 2 , M 3 , and M 4 . Then find, 

and verify, a general formula for $M^n$.

12.34 The complex number $i$ satisfies the quadratic equation $x^2 + 1 = 0$.
This means that any complex number, no matter how high the
powers of $i$ that are involved, can always be reduced to a number of
the form $a + bi$. For example, the complex number $7 + i^2 + 3i^3 + 4i^6$
reduces to $2 - 3i$.

Similarly, since the golden ratio $\alpha$ satisfies the quadratic equation
$x^2 - x - 1 = 0$, any power of $\alpha$ can always be reduced to a number of
the form $a\alpha + b$. For example, since $\alpha^2 = \alpha + 1$, we can in turn reduce
$\alpha^3$ as follows: $\alpha^3 = \alpha(\alpha + 1) = \alpha^2 + \alpha = (\alpha + 1) + \alpha = 2\alpha + 1$.

In the same way reduce $\alpha^4$ and $\alpha^5$ to the form $a\alpha + b$. Then find a
general formula for $\alpha^n$, and verify your formula. Finally, verify a
similar formula for $\beta^n$.

12.35 + (S) We mentioned at the very beginning of the chapter that it was the 

French mathematician Edouard Lucas who first gave the Fibonacci 
sequence its name. The sequence 

1, 3, 4, 7, 11, 18, 29, ...

is now called the Lucas sequence. It behaves very much like the
Fibonacci sequence in that each term is the sum of the two previous
terms; that is, the sequence is defined recursively by

$$L_n = L_{n-2} + L_{n-1}$$

for $n > 2$, and $L_1 = 1$, $L_2 = 3$. (Sometimes, in addition, for
convenience, $L_0$ is also defined as $L_0 = 2$.)

(a) As with the Fibonacci numbers, there are many patterns to 
discover among the Lucas numbers. One of the easiest patterns 
to spot is that if we add every other Fibonacci number, then we 
always get a Lucas number. For example, 1 + 3 = 4, 2 + 5 = 7, 

3 + 8 = 11, 5 + 13 = 18, 8 + 21 = 29, and so on. 
Use induction to prove that for all $n > 1$,
$L_n = F_{n-1} + F_{n+1}$.

(b) Because the Fibonacci sequence and the Lucas sequence are 
both defined by the same recurrence relation, for every 
property you can find among the Fibonacci numbers, you can 
always find a similar property among the Lucas numbers. 
Complete the following four identities involving Lucas 
numbers (you need not prove any of these identities). 

$$L_{n-1} + L_{n+1} = \ ? \qquad L_1 + L_2 + L_3 + \cdots + L_n = \ ?$$

$$L_{n-1} \cdot L_{n+1} = \ ? \qquad L_1^2 + L_2^2 + L_3^2 + \cdots + L_n^2 = \ ?$$

(c) In Problem 12.3 we extended the Fibonacci sequence to the
negative integers; that is, we defined $F_{-1}, F_{-2}, F_{-3}, \ldots$. Do the
same thing for the Lucas sequence by finding the first few Lucas
numbers $L_{-1}, L_{-2}, L_{-3}$, and so on, and then find a formula for
$L_{-n}$ in terms of $L_n$.

12.36 *(H,S) 

(a) Use Theorem 12.2 to find a formula for the $n$th Lucas
number $L_n$.

(b) Use Binet’s formula along with the formula $L_n = F_{n-1} + F_{n+1}$
from Problem 12.35 to verify the formula you found for $L_n$ in
part (a).

(c) Use the formula in part (a) to prove that if $L_1, L_2, L_3, \ldots$ is the
Lucas sequence, then the sequence $\left\{ \frac{L_{n+1}}{L_n} \right\}$ of ratios of
consecutive terms converges to $\alpha = \frac{1 + \sqrt{5}}{2}$, the golden ratio, as $n$
goes to infinity.

This last result shows that the amazing property of the 
Fibonacci sequence in Theorem 12.1 depends only on its 
recurrence relation, and not at all on the values of the first two 
terms of the sequence! 

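12.37 (H,S) It is straightforward to compute the following 2 × 2 and 3 × 3
determinants:

$$\begin{vmatrix} 3 & i \\ i & 1 \end{vmatrix} = 4 \qquad \text{and} \qquad \begin{vmatrix} 3 & i & 0 \\ i & 1 & i \\ 0 & i & 1 \end{vmatrix} = 7,$$

where $i = \sqrt{-1}$. In fact, even for the 1 × 1 determinant having this
same form, we get $|3| = 3$; so all three of these determinants are Lucas
numbers.

Prove that, for all $n$, the following $n \times n$ determinant is the $(n + 1)$st
Lucas number:

$$\begin{vmatrix}
3 & i & 0 & 0 & 0 & \cdots & 0 & 0 \\
i & 1 & i & 0 & 0 & \cdots & 0 & 0 \\
0 & i & 1 & i & 0 & \cdots & 0 & 0 \\
0 & 0 & i & 1 & i & \cdots & 0 & 0 \\
0 & 0 & 0 & i & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & 0 & \cdots & 1 & i \\
0 & 0 & 0 & 0 & 0 & \cdots & i & 1
\end{vmatrix} = L_{n+1}.$$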

12.38 (H,S) Use Theorem 12.7 to show that 17 cannot divide any odd 

Fibonacci number. 

12.39 Generating functions can be used to prove various identities. In this 
problem we will use generating functions to prove yet again the 
identity 

$$F_0 + F_1 + F_2 + F_3 + \cdots + F_n = F_{n+2} - 1.$$


The idea is to find a generating function for each side of this identity, 
and then show that the two generating functions are the same. 

We do the left side first. Note that if we take the geometric series
$\frac{1}{1 - x} = 1 + x + x^2 + x^3 + \cdots$ and multiply it by the power series for the
Fibonacci sequence $f(x) = F_0 + F_1 x + F_2 x^2 + F_3 x^3 + \cdots$, we get

$$(1 + x + x^2 + x^3 + \cdots)(F_0 + F_1 x + F_2 x^2 + F_3 x^3 + \cdots)$$
$$= F_0 + (F_0 + F_1)x + (F_0 + F_1 + F_2)x^2 + (F_0 + F_1 + F_2 + F_3)x^3 + \cdots.$$

Therefore, the generating function for the sequence on the left side of
the identity is $\frac{f(x)}{1 - x}$.
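The effect of multiplying by the geometric series can be seen on truncated coefficient lists. This is only an illustrative sketch working with the first ten coefficients; the name `multiply_series` is not from the text.

```python
def multiply_series(a, b):
    """Coefficients of the product of two truncated power series.

    `a` and `b` list coefficients from the constant term upward; only as many
    product coefficients as are fully determined by both inputs are returned.
    """
    terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(terms)]


fib = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]  # F_0 through F_9
geometric = [1] * len(fib)               # 1 + x + x^2 + ...
partial_sums = multiply_series(geometric, fib)
```

Each entry of `partial_sums` is a running total $F_0 + F_1 + \cdots + F_n$, here `[0, 1, 2, 4, 7, 12, 20, 33, 54, 88]`, so multiplying by $1 + x + x^2 + \cdots$ acts as a partial-sum operator on the coefficients.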

Now, the generating function for the sequence on the right side is
given by

$$(F_2 - 1) + (F_3 - 1)x + (F_4 - 1)x^2 + (F_5 - 1)x^3 + \cdots$$
$$= (F_2 + F_3 x + F_4 x^2 + F_5 x^3 + \cdots) - (1 + x + x^2 + x^3 + \cdots)$$
$$= \frac{f(x) - x}{x^2} - \frac{1}{1 - x}.$$

Finish this proof of the above Fibonacci identity by showing that
the two generating functions $\frac{f(x)}{1 - x}$ and $\frac{f(x) - x}{x^2} - \frac{1}{1 - x}$ are the same, where
$f(x) = \frac{x}{1 - x - x^2}$ is the generating function for the Fibonacci sequence.

12.40 (H) Who would ever guess that the number 89 can be used to produce
the Fibonacci sequence? But just look at the first few digits of the
decimal expansion of its reciprocal: $\frac{1}{89}$ = .011 235 .... These are the
first six Fibonacci numbers: 0, 1, 1, 2, 3, and 5. Unfortunately, this very
nice pattern seems to stop at this point because if we compute a few
more digits of $\frac{1}{89}$ we get .011 235 955 056 ... , and we can no longer
see the Fibonacci sequence; however, it is still there, merely hidden:

    1/89 = .011 235 000 000 000 ...
         + .000 000 800 000 000 ...
         + .000 000 130 000 000 ...
         + .000 000 021 000 000 ...
         + .000 000 003 400 000 ...
         + .000 000 000 550 000 ...
         + .000 000 000 089 000 ...
         + .000 000 000 014 400 ...
         + .000 000 000 002 330 ...
         + .000 000 000 000 377 ...

         = .011 235 955 056 ... .

Verify that the number 89 does in fact produce the Fibonacci
numbers in this way; in other words, verify that

$$\frac{1}{89} = \sum_{n=1}^{\infty} \frac{F_n}{10^{n+1}}.$$
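Before attempting the verification, it can be reassuring to watch exact partial sums of this series close in on $\frac{1}{89}$. The sketch below is a numerical sanity check only, not the verification the problem asks for; the function name `partial_sum` is illustrative.

```python
from fractions import Fraction


def partial_sum(terms):
    """Exact value of F_1/10^2 + F_2/10^3 + ... + F_terms/10^(terms + 1)."""
    total = Fraction(0)
    a, b = 0, 1  # (F_0, F_1); b holds F_n at the top of each iteration
    for n in range(1, terms + 1):
        total += Fraction(b, 10 ** (n + 1))
        a, b = b, a + b
    return total


gap = Fraction(1, 89) - partial_sum(60)
```

After sixty terms the remaining `gap` is positive and smaller than $10^{-40}$, which is consistent with the claimed sum but does not prove it.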
Why do people do mathematics? Probably the most honest answer
is that most mathematicians do mathematics because it gives them 
pleasure. In other words, they do it “because it’s fun.” However, it is 
also obvious that mathematics can be very useful in the real world. For 
example, the early Egyptians were called “rope-stretchers” because of 
their unsurpassed skill at surveying, using ropes with knots placed at 
equal lengths. Or think about the immense practical value that resulted 
from the work of Fibonacci as the Hindu-Arabic number system spread 
through Europe. 

But what can be the practical value of the number theory we have 
been studying in this book? One very good answer is that many other 
areas of mathematics that do have direct applications to the real world 
often, in turn, depend heavily upon important ideas from number 
theory. In other words, number theory is an essential strand in the 
larger fabric of mathematics, including applied mathematics. 

In this chapter, however, we will look at another answer to this 
question — a very surprising answer. In 1760, Euler proved one of his 
most celebrated theorems, Theorem 7.1 on page 192. It is fairly safe to 
say he proved this theorem mostly for the “fun of it.” He might have
also considered his theorem practical in the sense that it would be used 
to prove other results in number theory, but he certainly could not 
have expected it to have any applications to the real world. And, for 
more than two hundred years, no one else saw any practical use for this 
theorem either. 

Then, as we mentioned in Chapter 9, in 1977, Ronald L. Rivest, 
Adi Shamir, and Leonard Adleman, three computer scientists at the 
Massachusetts Institute of Technology, invented a stunning new system 
for sending secret messages, based on prime numbers and Theorem 7.1, 
that completely revolutionized the way in which these secret messages 
can be used. Before we look at these extremely important modern codes, 
now called the RSA encryption system after the three inventors, we begin 
this chapter by looking at a few examples of simple codes and ciphers 
that people have devised in the past. Our first two examples come 
unexpectedly from the slopes of Mount Everest. 
Secret Codes on Mount Everest

There is a long history of using runners to carry important messages, 
both secret and otherwise; just think of the legend of Pheidippides 
running twenty-six miles to Athens to report the Greek victory over the 
Persian forces at the Battle of Marathon. More recently, James Morris, 
a special correspondent for the Times of London, employed runners to 
carry coded messages in 1953 when he accompanied the British Mount 
Everest Expedition during its attempt to climb the highest mountain on 
earth. 

During the expedition Morris used Sherpa runners to carry on-site 
reports from the Base Camp at Everest to the nearest cable office 200 
miles away in Katmandu. However, since there were reporters from rival 
newspapers in the region, he also needed a way to ensure the secrecy 
of his final report announcing any successful ascent (or failed ascent). 
Therefore, he devised a code, replacing certain words or phrases with 
other words or phrases: for example, “snow conditions bad” would 
mean “message to begin”; “advanced base abandoned” would stand 
for “Hillary”; and “awaiting improvement” would be for “Tenzing.” So, 
when his coded message 

Snow conditions bad stop advanced base abandoned yesterday stop 

awaiting improvement . . . 

reached London safely, the Times was able to announce to the entire 
world that the summit of Everest had been reached on May 29 by 
Edmund Hillary and Tenzing Norgay. 

Coded messages would be used again on Everest during a 1999 ex- 
pedition whose goal was to solve the mystery of what really happened 
to George Mallory and his climbing partner when they disappeared 
from sight near the top of Mount Everest in June 1924. The climber 
who discovered Mallory’s body at about 27,000 feet on the north face of 
Everest radioed to his companions: “Last time I went bouldering in my 
hobnails, I fell off. Come on down. Let’s get together for Snickers and 
tea.” Knowing that there were other expeditions with radios spread all 
over the mountain, they had agreed beforehand on several codewords. 
“Boulder” was the code word for “body.” 

In the two examples just cited, entire words or phrases were assigned 
alternate meanings — this method of sending secret messages is usually 
called a code. This method has the advantage that coded messages can 
be disguised to look like ordinary messages. This worked well for Morris
but not so well for the team that found Mallory’s body. Others on the 
mountain that day quickly became suspicious when the team went 
into immediate radio silence after their discovery. Before they had even 
returned to Base Camp that night the rest of the world had already
received the news that Mallory’s body had been found. 

In the next section, we will discuss a very different strategy for send- 
ing secret messages. In a cipher, the content of a message is disguised, by 
assigning a new symbol for each individual character in the text of the 
message. 


Caesar and Vigenere Ciphers 

Julius Caesar used a cipher that simply shifted each letter in the alpha- 
bet three letters to the right. So, for example, using the English alphabet 
to illustrate this cipher, A would become D, K would become N, and, 
quite naturally, X, Y, and Z would become A, B, and C, respectively. 
Thus, for example, the message 

RETURNTOROME 


would be enciphered as 


UHWXUQWRURPH . 

This is an example of a shift cipher, sometimes called a Caesar cipher, 
in which each letter is shifted by a fixed number, k, of places in the 
alphabet. Equivalently, we represent the letters from A to Z by
the “numbers” 00, 01, 02, . . . , 25, and the cipher merely replaces a given
letter X by the letter X + k (mod 26). Deciphering a secret message is
then simply a matter of replacing each letter Y by Y - k (mod 26).
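
The shift rule above can be sketched in a few lines of Python. This is only an illustration: the function name shift_cipher is our own, and the sketch assumes the message contains only the capital letters A to Z.

```python
import string

ALPHABET = string.ascii_uppercase  # A..Z stand for the numbers 0..25


def shift_cipher(text: str, k: int) -> str:
    """Replace each letter X by the letter X + k (mod 26)."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + k) % 26] for ch in text)


ciphertext = shift_cipher("RETURNTOROME", 3)   # Caesar's shift of three
plaintext = shift_cipher(ciphertext, -3)       # deciphering subtracts k
print(ciphertext)   # UHWXUQWRURPH
print(plaintext)    # RETURNTOROME
```

Deciphering reuses the same function with the shift negated, which mirrors the observation that Y - k undoes X + k.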

The word cryptography comes to us from two Greek words — kruptos 
meaning hidden and graphos meaning writing. Edgar Allan Poe, the 
famous American author who created the genres of horror and mystery 
fiction, was fascinated by cryptography, and he placed a cipher — the 
key to a buried treasure — at the very center of one of his most popular 
short stories, “The Gold Bug.” In this story he explains how one might 
begin to decipher an encrypted message: “Now, in English, the letter 
which most frequently occurs is e. Afterward, the succession runs thus: 
a o i d h n r s t u y c f g l m w b k p q x z.” Then he spends several pages of
the story in a detailed analysis showing exactly how to solve this “very 
simplest species of cryptograph.” 

In fact, any Caesar cipher, or, more generally, any cipher in which 
each letter is always replaced by the same symbol whenever it occurs — 
these are called monoalphabetic ciphers — can easily be cracked by the 
statistical approach advocated by Poe (though not using his data: while 
e is far and away the most frequently used letter, the next most frequent
are t, n, i, r, o, and a).
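
As a rough illustration of the statistical approach, the following Python sketch guesses the shift of a Caesar cipher by assuming that the most common letter in the ciphertext stands for E. The message and the function names are hypothetical, and the guess is reliable only when the message is long enough for E to dominate.

```python
from collections import Counter


def shift(text: str, k: int) -> str:
    """Shift every letter of an uppercase A-Z string by k places (mod 26)."""
    return "".join(chr((ord(ch) - 65 + k) % 26 + 65) for ch in text)


def guess_shift(ciphertext: str) -> int:
    """Guess k by assuming the most common ciphertext letter stands for E."""
    most_common_letter, _count = Counter(ciphertext).most_common(1)[0]
    return (ord(most_common_letter) - ord("E")) % 26


secret = shift("MEETMEATTHEGREENTREE", 3)   # hypothetical message, shift 3
k = guess_shift(secret)
print(k, shift(secret, -k))
```

For ciphers other than simple shifts, the same counting idea applies, but each letter has to be matched individually against the expected frequencies.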

Another factor that can come into play in deciphering secret mes- 
sages is that written languages have a certain amount of redundancy in 
that a few characters can be missing in a given text and the message is 
still completely understandable. For English the level of redundancy is 
about 50%. Anyone who has ever seen the television game show Wheel 
of Fortune should have little trouble “deciphering” the following famous 
phrase: 


_ o _ _   s _ o _ e   a _ _   s e _ e _   _ e a _ s   a _ o,

even though at this point only four letters have been deciphered. 

A significant breakthrough in cryptography came in 1553 when 
Giovan Battista Bellaso came up with a cipher that, while very similar
to a simple monoalphabetic cipher, seemed to defy ordinary frequency 
analysis. A French cryptographer, Blaise de Vigenere, published a ver- 
sion of this cipher in 1586 and it is now known as the Vigenere cipher, 
although it has been rediscovered many times (see Problem 13.1). 

The Vigenere cipher is an example of a polyalphabetic cipher because, 
during the encryption, a letter does not always get sent to the same 
symbol each time it occurs. Here is how it works. The fundamental idea 
is that it uses a secret key, that is, a word or phrase. This secret key is 
known to both the sender and the receiver of the message. As before, the 
letters from A to Z are represented by the “numbers” 00, 01, 02, ... , 25. 
In this example, we will use as the key word raven, which gets expressed 
numerically as 17 00 21 04 13. 

Now, suppose that the message we wish to send is 

ONCE UPON A MIDNIGHT DREARY. 

We first write this message in digits with the secret key word written 
numerically in the row below it as many times as required: 

14 13 02 04 20 15 14 13 00 12 08 03 13 08 06 07 19 03 17 04 00 17 24 

17 00 21 04 13 17 00 21 04 13 17 00 21 04 13 17 00 21 04 13 17 00 21.

Then we add these numbers modulo 26 to get


05 13 23 08 07 06 14 08 04 25 25 03 08 12 19 24 19 24 21 17 17 17 19 

and rewrite this enciphered numeric message in terms of letters as

FNXIHGOIEZZDIMTYTYVRRRT. 

Note that this polyalphabetic cipher has changed the two I’s in the word
MIDNIGHT to different letters (Z and M), and similarly, in the enciphered
message the three R’s near the end all correspond to different letters in
the original message (E, A, and R). In order to decipher an encrypted
message the same process is used in reverse, merely subtracting the 
secret key modulo 26 instead of adding. 
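
The procedure just described can be sketched in Python as follows. The function name vigenere and its sign parameter are our own choices for illustration, and the sketch assumes both the message and the key consist only of capital letters.

```python
from itertools import cycle


def vigenere(text: str, key: str, sign: int = 1) -> str:
    """Add (sign=1) or subtract (sign=-1) the repeated key, letter by letter, mod 26."""
    out = []
    for ch, k in zip(text, cycle(key)):
        out.append(chr((ord(ch) - 65 + sign * (ord(k) - 65)) % 26 + 65))
    return "".join(out)


cipher = vigenere("ONCEUPONAMIDNIGHTDREARY", "RAVEN")
print(cipher)                              # FNXIHGOIEZZDIMTYTYVRRRT
print(vigenere(cipher, "RAVEN", sign=-1))  # ONCEUPONAMIDNIGHTDREARY
```

Using one function for both directions makes it plain that deciphering differs from enciphering only in the sign of the key.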


Unbreakable Ciphers 

As late as 1917 Scientific American claimed that Vigenere ciphers were 
impossible to break; yet, just like Caesar ciphers, they too are completely 
susceptible to statistical analysis (see Problem 13.2). In a Caesar cipher 
each character in the message is shifted by the same amount. In a 
Vigenere cipher, while characters are shifted by different amounts, the 
shifts still present a repetitive pattern, and it is this repetition that 
makes the Vigenere cipher susceptible to statistical analysis. Nonethe- 
less, the fundamental principles underlying Caesar and Vigenere ci- 
phers did ultimately become the basis for truly unbreakable ciphers 
that were routinely used during the twentieth century by military and 
government officials prior to the modern computer era. 

This vulnerability of repetition can be avoided by the use of what is 
called a one-time pad — that is, a secret key that again is known to both 
the sender and the receiver and works in almost the same way as in the 
Vigenere cipher. There are three differences in this case: the secret key 
is precisely the same length as the message to be sent; the secret key is 
completely random; and, as its name suggests, a one-time pad is used 
only once (in fact, actual pads of paper were used, the top sheet being 
destroyed after it was used). 

The reason a one-time pad cipher is unbreakable is that as long as 
the secret key is truly random, the encrypted message contains no 
information about the original message, except its length. For example, 
if the encrypted message happens to be egglrwlghrefrsiekedz, then 
there is no way to decide which of the following messages is more 
likely to be the original message: missilearrivedsafely or
shipmentdelayedagain; moreover, within any given context, there will be a very
large number of equally likely possible original messages. 
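
To see concretely why the ciphertext gives nothing away, the following Python sketch works out, for each of the two candidate messages, the pad that would have produced the ciphertext above. The function names are our own; the point is only that such a pad exists for every message of the right length.

```python
def pad_for(ciphertext: str, plaintext: str) -> str:
    """Return the pad (key) that turns this plaintext into this ciphertext."""
    return "".join(
        chr((ord(c) - ord(p)) % 26 + 97) for c, p in zip(ciphertext, plaintext)
    )


def encipher(plaintext: str, pad: str) -> str:
    """Add the pad to the plaintext letter by letter, modulo 26."""
    return "".join(
        chr((ord(p) - 97 + ord(k) - 97) % 26 + 97) for p, k in zip(plaintext, pad)
    )


ciphertext = "egglrwlghrefrsiekedz"
for candidate in ("missilearrivedsafely", "shipmentdelayedagain"):
    pad = pad_for(ciphertext, candidate)
    print(candidate, pad, encipher(candidate, pad) == ciphertext)
```

Since a truly random pad is as likely to be one of these keys as the other, an eavesdropper has no basis for preferring either message.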

Such a one-time pad encryption system was used to secure the hot- 
line between Moscow and Washington D.C. in the 1960s following the 
Cuban missile crisis. The secret keys that were used for this encryption 
system were physically delivered to the respective embassies of the two
countries. Earlier, in 1957, the Russian agent Rudolf Abel was arrested in
New York City (he was later convicted) and had on him at the time a
one-time pad about the size of a postage stamp.

The great disadvantage of a one-time pad encryption system, how- 
ever, is that a new random key has to be created, delivered, and de- 
stroyed for every single message that is sent (see Problem 13.3). This is 
why this method has been used primarily only for messages that were 
felt to be extremely important. 

Once the computer age arrived, the nature of cryptography changed 
quickly. In particular, there was a rapidly growing need for security 
in areas other than espionage and the military. In 1976, the Data 
Encryption Standard (DES) was adopted by the National Bureau of 
Standards as the official information processing cipher for the United 
States. DES was developed by a team at IBM and operates on 64-bit
binary blocks with a 64-bit key, only 56 bits of which are actually used for encryption (see
Problem 13.4). Although DES has been widely used in applications both 
in the United States and internationally, it has now been replaced by the 
Advanced Encryption Standard (AES), which uses key sizes of 128, 192, 
and 256 bits. 


Public-Key Systems 

It seems self-evident that if an encrypted message is to remain secret, 
then only the sender of the message and the person receiving the 
message should have access to the specific secret key by which the 
message was encrypted in the first place. In other words, for example, it 
is obvious that in the one-time pad encryption system only the sender 
and receiver should have access to the one-time pad, or in our simple 
Vigenere cipher example above, only the sender and receiver should 
know the secret word raven. 

However, what seems self-evident is not always true. In 1975, three 
people, two electrical engineers at Stanford University, Whitfield Diffie 
and Martin Hellman, and an undergraduate at the University of Califor-
nia, Berkeley, Ralph Merkle, created an entirely new kind of cipher that 
makes it possible to send secure messages over and over again even if 
the exact method of encrypting these messages should become known 
publicly. 

Then, in 1977, Rivest, Shamir, and Adleman used this same idea to 
develop a specific system based on prime numbers, the revolutionary 
encryption system we now call the RSA system, which is used widely 
today to ensure the security of every conceivable form of electronic 
communication. 

The basic idea behind this revolutionary new kind of cipher is
surprisingly simple. The reason that the secret key in the Vigenere cipher
needs to be secret is that anyone who knows the specific arithmetic 
that was used to encipher a given message can simply reverse that 
arithmetic to decipher the message (in this case, because addition is 
used for enciphering, the inverse process, subtraction, can be used for 
deciphering). The idea behind the new cipher systems is to encipher 
messages using a relatively straightforward arithmetic operation that 
can be known to the public, but is such that deciphering requires an 
inverse arithmetic operation that is so complicated computationally 
that it is impossible to do — impossible, that is, unless you are supposed 
to be able to decipher the message, in which case you will have access to 
a vital piece of extra information that allows you to do the computation. 

The way in which the RSA system implements this basic idea is that
a message M gets encrypted using two numbers, n and a, which can be
made public and which can be used for all messages. And it can also be
known that the operation that will be used to produce the encrypted
message C is just $C \equiv M^a \pmod{n}$. Now, in order to decipher the
message C, all that needs to be done is to perform an operation
$C^b \equiv M \pmod{n}$ (why this is an inverse operation will be explained
shortly), using a number n where n = pq is a product of two primes p and q. The
catch is that in order to perform this operation you need to know what 
the number b is. The clever part of this system, then, is that the value 
of b can be hidden from anyone who should not be able to decipher the 
message. How is this done? 

The answer is that if n is a relatively small number, then the value of 
b can’t be hidden because anyone could figure out what b is, but if n is 
sufficiently large, then it becomes impossible to do the computations 
required to find b in anything close to a reasonable amount of time, 
even using the fastest computers. 
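
A small Python sketch makes the first half of this claim concrete. The public key (2773, 17) is a hypothetical toy example of our own, not one from the text, and the function name is likewise our own. For an n this small, trial division finds the prime factors immediately, and b follows at once.

```python
def crack_small_key(n: int, a: int) -> int:
    """Recover the deciphering exponent b from a public key (n, a) with small n."""
    p = next(d for d in range(2, n) if n % d == 0)  # smallest prime factor of n
    q = n // p
    phi = (p - 1) * (q - 1)
    return pow(a, -1, phi)  # inverse of a modulo phi(n); needs Python 3.8+


print(crack_small_key(2773, 17))   # 157
```

The same trial division is hopeless when n has hundreds of digits, which is the whole point of choosing p and q to be enormous.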

Let’s look at the details of the RSA encryption system to see how 
Rivest, Shamir, and Adleman used prime numbers and Theorem 7.1 
(Euler’s famous theorem of 1760) to make this work. Remember that the 
two numbers n and a, as well as the exponential enciphering operation 
$C \equiv M^a \pmod{n}$, are to be public knowledge. We begin by letting
n = pq, where p and q are prime numbers. (This is the heart of the
matter: the value of n is public information, but the values of p and q
are secret.)

Next, we choose a value for our exponent a by letting a be a positive
integer relatively prime to $\phi(n)$. Recall from Chapter 7 that
$\phi(n) = \phi(pq) = \phi(p)\phi(q) = (p - 1)(q - 1)$, so in practice a is chosen to be
relatively prime to p - 1 and to q - 1, and hence to $\phi(n)$. This could
be done by first finding the prime factorization of these two numbers,
or, more simply, by letting a be a prime larger than these two numbers.
(Again, the value of a is public information, but exactly how a was 
chosen is not.) 

To encipher a message we first turn it into a number M by representing
the letters from A to Z by 00, 01, 02, ... , 25. Since the enciphering
operation will be done modulo n, we require that M < n. (For a longer
message than this, we must break the message into several shorter
submessages, and encipher each one separately.) Then, the encrypted
message C is

$$C \equiv M^a \pmod{n}.$$

How do we reverse this operation and decipher this message? Consider
a function such as $x^3$, whose inverse function is the cube root
function $x^{1/3}$; this is true because $(x^3)^{1/3} = x^1 = x$. In other
words, when we apply the original function to x and then apply the
inverse function to the result, we get x back again. In the same way,
what we need is to find a number b such that

$$C^b \equiv (M^a)^b \equiv M^{ab} \equiv M \pmod{n}.$$

In other words, the deciphering operation $C^b$ needs to behave like an
$a$th root of the enciphering operation $M^a$. This is where $\phi(n)$ comes in.

We let b be the inverse of a modulo $\phi(n)$; that is, by Theorem 6.1, b is
the unique solution to the congruence $ax \equiv 1 \pmod{\phi(n)}$. Therefore,

$$ab \equiv 1 \pmod{\phi(n)}.$$

In practice, we can find b by using the Euclidean algorithm to express
1 as a linear combination of a and $\phi(n)$; thus, once we have 1 expressed
as $1 = ax + \phi(n)y$, then the number x is the value of b we are looking
for. Note that actually finding b this way requires us to know the value
of $\phi(n)$.
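
The following Python sketch shows one way to carry out this step with the extended Euclidean algorithm. The function names and the sample values a = 17 and $\phi(n)$ = 2668 (from the hypothetical primes 47 and 59) are our own. Because the coefficient x may come out negative, the sketch reduces it modulo $\phi(n)$ before returning it.

```python
def extended_gcd(a: int, m: int):
    """Return (g, x, y) with g = gcd(a, m) and g == a*x + m*y."""
    old_r, r = a, m
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y


def deciphering_exponent(a: int, phi: int) -> int:
    """Return b with a*b congruent to 1 modulo phi, or fail if none exists."""
    g, x, _y = extended_gcd(a, phi)
    if g != 1:
        raise ValueError("a must be relatively prime to phi(n)")
    return x % phi  # representative between 0 and phi - 1


print(deciphering_exponent(17, 2668))   # 157
```

Here the algorithm finds 1 = 17 * 157 - 2668, so b = 157.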

At this point we impose a temporary restriction on the message M
and require that M be relatively prime to n. This allows us to use Euler’s
theorem, Theorem 7.1, to conclude that $M^{\phi(n)} \equiv 1 \pmod{n}$. Now we are
ready. Since $ab \equiv 1 \pmod{\phi(n)}$, we can write $ab = 1 + k\phi(n)$, for some
integer k. Therefore, by Euler’s theorem,

$$C^b \equiv (M^a)^b \equiv M^{ab} \equiv M^{1 + k\phi(n)} \equiv M\,(M^{\phi(n)})^k \equiv M\,(1)^k \equiv M \pmod{n},$$


exactly as desired. 

It is very unlikely that M and n will fail to be relatively prime, 
especially in practice where p and q are chosen to be enormous primes. 
Nonetheless, in Problem 13.9, you will be asked to show that if by
chance either p or q divides M (they can’t both divide M since M < n),
then it is still the case that $C^b \equiv M \pmod{n}$.
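
A numerical check is no substitute for the proof requested in Problem 13.9, but it can be reassuring. The Python sketch below uses hypothetical toy values of our own (p = 47, q = 59, a = 17) and tests every message M below n, including those divisible by p or q.

```python
p, q = 47, 59                 # hypothetical small primes
n, phi = p * q, (p - 1) * (q - 1)
a = 17                        # public exponent, relatively prime to phi
b = pow(a, -1, phi)           # deciphering exponent (Python 3.8+)

# Messages for which deciphering fails to return the original M.
failures = [M for M in range(n) if pow(pow(M, a, n), b, n) != M]
# Nonzero messages that share a factor with n.
shared_factor = [M for M in range(1, n) if M % p == 0 or M % q == 0]
print(len(failures), len(shared_factor))   # 0 104
```

Only 104 of the 2772 nonzero messages share a factor with n in this toy case, and the proportion shrinks rapidly as p and q grow.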

We now see why messages encrypted by the RSA system can be 
made secure. As we saw in Chapter 9, it is currently just barely possible 
to factor a 200-digit number that is itself a product of two 100-digit 
primes. However, the fact that the 768-bit (232-digit) number RSA-768
was factored in 2009 suggests that within the next few years using
even 1024-bit (308-digit) numbers for RSA encryption will no longer be
considered secure, even though the task of factoring numbers of this
size is about a thousand times harder than factoring RSA-768.

Before we look at a specific example, let us mention an important fea- 
ture of this public-key encryption system that contributes significantly 
to it being so widely used. Imagine a company — this might be a business 
selling a product online, or a bank making daily transfers of funds elec- 
tronically, or a government agency needing to handle frequent highly 
sensitive correspondence — that wants to be able both to send secure 
messages within the company and to receive secure messages from 
people outside the company residing anywhere in the world. Because 
the two numbers n and a, the public key, are available to anyone in the 
world, it is possible for people, whether they are in the company or not, 
to send encrypted messages to the company. However, the company 
maintains complete control of who can decipher these messages by 
determining who in the company has the extra information as to the 
actual values of p and q, and hence the value of $\phi(n)$, since
$\phi(n) = (p - 1)(q - 1)$. Moreover, the public key, (n, a), never needs to be changed,
unless there is some indication that the values of p and q are no longer 
secret. 

The feature that a company can receive encrypted messages from 
outside the company has an added bonus that is of enormous ben- 
efit. Imagine two banks, Bank A and Bank B, and that you have just 
instructed Bank A to make a very large electronic transfer from your 
account in Bank A to another account you have at Bank B in the 
Cayman Islands. When Bank B receives this electronic message, before it 
deposits this money in your account, it would very much like to be sure 
that the message actually came from Bank A. In other words, it would 
like what is called a digital signature. Here is how Bank A can “sign” the 
message. 

Suppose that both banks use RSA public key encryption systems and 
that the public key for Bank A is (n, a). Bank B also has its own public 
key, but we don't need to know what it is for this discussion. Let M be 
the message containing the instructions concerning the transfer of your 
funds. Bank A enciphers message M using the public key for Bank B, and 
sends this encrypted message to Bank B, which Bank B deciphers using
its own deciphering process. At this point, Bank B has received message 
M, but still needs to be sure the message actually came from Bank A. 

To sign this first message, Bank A applies its own deciphering process
to the message M and produces a second message $M^b$. Then Bank A
enciphers $M^b$ using the public key for Bank B, and sends this second
encrypted message to Bank B. When Bank B receives this (doubly
encrypted) message, it first deciphers it using its own deciphering process,
which produces $M^b$. Next, it can use the public key, (n, a), of Bank A
to produce $(M^b)^a \equiv M^{ab} \equiv M \pmod{n}$. At this stage, Bank B compares
this second message with the message it previously received. If the two
messages are identical, these messages had to have come from Bank A
because only Bank A could have applied the deciphering operation $M^b$
in the first place.
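
The exchange between the two banks can be sketched in Python with toy numbers. All of the values here are assumptions for illustration: Bank A is given the hypothetical key (2773, 17), Bank B the key (3127, 61), and the message is a single block M = 1704. The sketch also assumes that Bank A's modulus is smaller than Bank B's, so that the signed message fits inside a single block for Bank B.

```python
# Hypothetical toy keys; real keys use primes hundreds of digits long.
n_A, a_A = 47 * 59, 17                 # Bank A's public key (2773, 17)
b_A = pow(a_A, -1, 46 * 58)            # Bank A's secret exponent (Python 3.8+)
n_B, a_B = 59 * 53, 61                 # Bank B's public key (3127, 61)
b_B = pow(a_B, -1, 58 * 52)            # Bank B's secret exponent

M = 1704                               # the message, with M < n_A < n_B

# Bank A: send the message, then the signed message, both enciphered for Bank B.
first = pow(M, a_B, n_B)
signature = pow(M, b_A, n_A)           # only Bank A can compute this
second = pow(signature, a_B, n_B)

# Bank B: decipher both, then undo the signature with Bank A's public key.
received = pow(first, b_B, n_B)
recovered = pow(pow(second, b_B, n_B), a_A, n_A)
print(received, recovered, received == recovered)
```

If the two recovered values agree, Bank B concludes that the sender knew Bank A's secret exponent.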

We will now illustrate the RSA encryption system with an example 
using extremely small values of p and q for the sake of simplicity. So, let 
the public key be (3127, 61), where n = 3127 is the product of the primes
p = 59 and q = 53, and where the exponent a = 61 was chosen to be
a prime larger than both p and q to ensure that a would be relatively
prime to $\phi(n) = (p - 1)(q - 1) = 58 \cdot 52 = 3016$. (Note, however, that
in this case since $3016 = 2^3 \cdot 13 \cdot 29$ we had many other choices for the
exponent a, for example, 3.)

The message we wish to encipher is 

QUOTH THE RAVEN NEVERMORE, 

and so we as usual turn this into a number by representing the letters 
from A to Z by 00, 01, 02, ... , 25; and, since n = 3127, we in this case 
subdivide our message into eleven submessages of length four, giving us 

1620 1419 0719 0704 1700 2104 1313 0421 0417 1214 1704. 

This is the message M, which is in reality eleven messages $M_1, \ldots, M_{11}$.
We encipher M by using the exponent a = 61 to get the encrypted
message $C \equiv M^{61} \pmod{3127}$, which again in reality is eleven distinct
encrypted messages $M_1^{61}, \ldots, M_{11}^{61} \pmod{3127}$. Thus the encrypted
message C becomes

3104 0145 1567 1765 1914 2561 1841 2577 2660 1780 2489. 

Up to this point anyone could have enciphered this message because 
n = 3127 and a = 61 are public knowledge. Now, however, to decipher
the message C we need to find the number b that is the inverse of 61,
modulo 3016. But the number $\phi(n) = 3016$ is not public knowledge; the
only reason we know about 3016 is because we know that p = 59 and
q = 53. (Of course, in this simple example, anyone could find $\phi(3127)$
by merely counting all of the numbers from 1 to 3127 that are relatively 
prime to 3127, but we are pretending that is not a realistic option.) 

We proceed to find b using the Euclidean algorithm. Recall that since 
61 and 3016 are relatively prime, we can use the Euclidean algorithm 
to express the gcd, 1, of these two numbers as a linear combination of 
them. When we do this we get 

$$1 = 445 \cdot 61 - 9 \cdot 3016,$$

which means that b = 445. This allows us to decipher message C
by computing $C^{445} \equiv M \pmod{3127}$. In reality, again, this is eleven
distinct computations: $C_1^{445}, \ldots, C_{11}^{445} \pmod{3127}$.
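
The whole example can be replayed in a few lines of Python. The variable names are our own, and the sketch relies on Python's built-in three-argument pow for modular exponentiation (the form with exponent -1 needs Python 3.8 or later).

```python
n, a = 3127, 61                          # public key from the example
phi = (59 - 1) * (53 - 1)                # 3016, known only with p and q
b = pow(a, -1, phi)                      # 445

message = "QUOTHTHERAVENNEVERMORE"
digits = "".join(f"{ord(ch) - 65:02d}" for ch in message)   # A=00, ..., Z=25
blocks = [int(digits[i:i + 4]) for i in range(0, len(digits), 4)]

cipher = [pow(M, a, n) for M in blocks]  # encipher each submessage
plain = [pow(C, b, n) for C in cipher]   # decipher each submessage

print(" ".join(f"{C:04d}" for C in cipher))
print(" ".join(f"{M:04d}" for M in plain))
```

The first line printed is 3104 0145 1567 1765 1914 2561 1841 2577 2660 1780 2489, and the second reproduces the eleven original submessages.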

It should be clear that these encryption systems have far more 
flexibility than we have indicated. For the sake of simplicity we have 
restricted our character set to the alphabet A through Z. But there is 
no reason not to allow for additional characters — punctuation, spaces, 
upper- and lowercase letters, and so on. This, for example, would allow
us to quote accurately, and thus to encipher should we wish, Poe’s
famous line: Quoth the Raven, “Nevermore.”


Problems 

13.1 ★ Charles Dodgson, better known as Lewis Carroll, invented several
secret codes and in so doing rediscovered the Vigenere cipher (his 
version used a very handy 5x7 card with a 26 x 26 grid of letters on 
one side and instructions on the other side). 

(a) Using the secret key vigilance, send the following message
using the Vigenere cipher (as rediscovered by Charles
Dodgson):

MEET ME ON TUESDAY EVENING AT SEVEN. 

(b) Using ALICE as the key, decipher the message:

THIUFRTTNMGLVFXHPANMTSGVSVPA 

(you might recognize this phrase from Through the Looking 
Glass). 

13.2 (H) Explain why Vigenere ciphers, like Caesar ciphers and other
monoalphabetic ciphers, are still susceptible to statistical analysis.

13.3 In the one-time pad encryption system it is essential that the 
one-time pads be randomly generated. With a friend, devise a cipher 
that allows you to represent and encipher the twenty-six letters of the 
alphabet as well as the ten digits 0 through 9 and several common 
punctuation marks. Next devise a way to generate random one-time 
pads by hand (the point here is to maintain a spirit of pre-computer 
age cryptography). Then work out a way to transfer, and subsequently 
destroy, the secret keys. Finally, send a few secret messages. 

13.4 ★ Modern cipher systems such as DES and AES represent messages in
binary since that is convenient for computers. Thus, for example, DES
is known as a block cipher because it manipulates data in 64-bit blocks
of 0s and 1s. Its key is also 64 bits long, but only 56 of these bits
are actually used; the other 8 bits serve only as “parity checks” (this is much like the
“check digit” that occurs as the last digit of the twelve-digit bar code
that appears on almost all products sold these days, scanned in binary
as you may have noticed, in black and white stripes, black for 1 and
white for 0; see Problem 13.5). AES has block sizes of 128 bits.

In this problem we will discover one of the convenient features of 
these modern cipher systems. We represent letters as strings of 0s and
1s and, in this case, since there are twenty-six letters, it will be
sufficient for our purposes to use strings of length 5. Thus we let 

A = 00001, B = 00010, C = 00011, D = 00100, ... , Z = 11010, 


so the message meet at noon becomes 

01101 00101 00101 10100 00001 10100 01110 01111 01111 01110.

In order to encipher this message we use a key of the same length, 

01010 11111 00100 01100 11001 01010 10000 00010 11010 01100 , 

and add these two strings, bit by bit, modulo 2, and get 

00111 11010 00001 11000 11000 11110 11110 01101 10101 00010 . 

(a) Decipher this encrypted message by subtracting, bit by bit, the
key from the encrypted message (note that, modulo 2, 0 - 1 = 1).

(b) Show that you can also decipher this encrypted message by
adding the key, bit by bit, to the encrypted message. 

This is an example of what is called a symmetric-key encryption 
system because the same operation is used for enciphering and for 
deciphering. 

(c) Find a Caesar cipher that is a symmetric-key cipher. 

13.5 Whenever information is being transmitted numerically it is a good 
idea to include some measure of redundancy to help ensure that 
information is also being transmitted reliably. Electronic 
communications involving credit card numbers, bank identification 
numbers, and bar codes on products are all good examples of this. The 
Universal Product Code, or bar code, which you find on almost every 
product you purchase, consists of vertical black and white stripes 
representing, respectively, 1s and 0s. Each product sold has a
twelve-digit number attached to it, and the last digit is a check digit, 
whose only purpose is to ensure, as far as is practical, that no error has 
occurred in recording any previous digit. 

Find a recent product that you have purchased and locate the
twelve-digit product code. The twelfth digit is the check digit. First,
add the digits in the odd positions (that is, positions 1, 3, 5, ..., 11), then
multiply this sum by 3. Next, add the digits in the even positions 2 through 10
(stopping, that is, before the check digit). Then, add these two sums,
reduce this final sum modulo 10, and subtract the result from 10 (if the
reduced sum is 0, the check digit is simply 0). Is
this number equal to the check digit?
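
For readers who would like to check several products quickly, the recipe above might be sketched in Python as follows. The function name and the sample twelve-digit code are illustrative assumptions, and the final reduction modulo 10 reflects the usual convention that a result of 10 is recorded as the check digit 0.

```python
def upc_check_digit(first_eleven: str) -> int:
    """Compute the twelfth (check) digit from the first eleven digits of a UPC."""
    digits = [int(ch) for ch in first_eleven]
    odd_sum = sum(digits[0::2])    # positions 1, 3, ..., 11
    even_sum = sum(digits[1::2])   # positions 2, 4, ..., 10
    total = 3 * odd_sum + even_sum
    return (10 - total % 10) % 10  # the outer % 10 turns a result of 10 into 0


code = "036000291452"              # illustrative twelve-digit code
print(upc_check_digit(code[:11]), code[-1])   # 2 2
```

For this sample code the odd positions sum to 14 and the even positions to 16, so the total is 58 and the check digit is 10 - 8 = 2.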

13.6 ★ (S) Use the RSA encryption system to encipher the message 

LENORE 

using the public key (n, a) = (2701, 7).

13.7 ★ (S) You have recently been hired by a major perfume company, and a

large portion of your job involves industrial espionage. A rival firm is 
about to launch a new perfume, whose name is an extremely closely 
guarded secret worth hundreds of thousands of dollars. You not only 
managed to obtain an encrypted message containing this name, but 
an inside source also provided you with the critical prime values p 
and q used by the rival company to decipher its messages. The public 
key of the rival firm is (n, a) = (2881, 59); the encrypted version of the
name of the perfume you obtained is 


2647 0740 2441 2070 1485; 

and the primes that your (well-paid) source delivered to you are
p = 67 and q = 43. What is the name of the perfume?

13.8 (S) As it happens, your long-term boyfriend works for a major 

advertising agency and his most important client is the rival perfume 
company mentioned in Problem 13.7. Naturally, this company needs 
to share the name of its new perfume with your boyfriend so that he 
can plan the entire marketing campaign surrounding the launch of 
this new product. 

So, the company sends him the following encrypted message: 

0309 1424 0442 1714 0451 

using the public key (3127, 61) of his ad agency in order to provide 
him with the name of the perfume. It also sends him a second 
message: 


0239 2471 2228 2629 0398 

to sign its first message so that he can really be sure the message came 
from the company. 

Describe in detail exactly what your boyfriend does to learn the 
name of the perfume and verify the signature. By the way, this 
perfume company clearly trusts your boyfriend to run this important 
ad campaign but it is not about to trust him with any information 
about its own encryption system other than what is public to 
everyone: that its public key is (n, a) = (2881, 59). On the other hand, 
within his own agency he is one of the few employees who know that 
b = 445 is the exponent needed to decipher any messages that have
been encrypted using its own public key (3127, 61). Moreover, he also 
knows how to use Sage. 

13.9 (H,S) Recall that the stated purpose of this chapter — its raison d'etre, if you 
will — was to make the point that it took more than 200 years before 
anyone found a real-world application for Euler’s theorem. In order to 
apply Euler’s theorem in our explanation of the RSA encryption 
system, it was necessary to assume that the message M was relatively 
prime to the integer n that was being used to encipher M. 

This was a reasonable assumption, since in actual practice, where p 
and q are chosen to be enormous primes, there is an extremely remote 
chance that either of these primes will divide M, and therefore M and 
n = pq are highly likely to be relatively prime. Nonetheless, just to be 
absolutely safe, we should consider the remote possibility that p or q
divides M. 

So, in this problem, assume that for a public key (n, a) we have a 
message M < n such that M and n are not relatively prime. Prove that, 
even in this case, for the encrypted message C, and for the integer b 
that is the inverse of a modulo $\phi(n)$, it is still true that
$C^b \equiv M \pmod{n}$.


14 

Continued Fractions 


The Golden Ratio Revisited 

In Chapter 12, on Fibonacci numbers, we repeatedly came across the
number $\alpha = \frac{1+\sqrt{5}}{2}$, the golden ratio, which is the positive solution to
the quadratic equation $x^2 - x - 1 = 0$.

If we take this equation $x^2 - x - 1 = 0$, rewrite it as $x^2 = x + 1$, and
divide by x, we get

$$x = 1 + \frac{1}{x}.$$


Since x is also now on the right side of this equation, we can replace x
by $1 + \frac{1}{x}$ in the right side of this equation to get

$$x = 1 + \cfrac{1}{1 + \cfrac{1}{x}}.$$

Continuing in this fashion, we conclude that 


*= 1 +-= 1 + 

x 


1 


1 

1 + - 
X 


= 1 + 


= 1 


1 + 


1 + 


1 + 


1 


1 

1 + - 
x 


Since $x = \alpha = \frac{1+\sqrt{5}}{2}$ is a solution to each of these 
equations, this leads us, quite naturally, to the following representation 
of the golden ratio $\alpha = \frac{1+\sqrt{5}}{2}$ that we first saw in 
Problem 3.36 as an infinite continued fraction: 

\[ \alpha = \frac{1+\sqrt{5}}{2}
   = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}}}. \]

We also saw in Chapter 3 that a very old idea inherent in the Euclidean 
algorithm allows us to write any rational number such as $\frac{72}{30}$ as a 
finite continued fraction: 

\[ \frac{72}{30} = 2 + \cfrac{1}{2 + \cfrac{1}{2}}. \]


Around A.D. 500, in India, Aryabhata used continued fractions in a 
similar way to solve linear equations. 

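In Chapter 3, we wrote the following square roots as repeating infinite 
continued fractions (see Problem 3.34): 

\[ \sqrt{3} = 1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{2 + \cdots}}}},
   \qquad
   \sqrt{7} = 2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{4 + \cdots}}}}. \]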

In 1572, the Italian mathematician Raphael Bombelli represented 
square roots as continued fractions in his L'Algebra Opera, a book 
that was quite influential in its day, reflecting as it did the state of 
sixteenth-century mathematics, including the first treatment of complex 
numbers, as well as bringing about a more general awareness 
of the work of Diophantus. And, in Chapter 4, we saw that continued 
fractions could be used to provide solutions to Pell’s equation 
$x^2 - ny^2 = 1$. 

Although it was John Wallis who in 1653 introduced the term 
“continued fraction,” it was Leonhard Euler who first investigated 
these expressions in a systematic way. Euler discovered a truly 
beautiful continued fraction representation for the irrational 
number e: 


1 


2 + 


3 + 


5 -f 


c — 2 + 
By now you have perhaps been convinced that continued fractions 
seem to be interesting, and possibly even useful; however, what an 
infinite continued fraction really is has not yet been made at all clear. 
Up to now we have been dealing with these expressions rather casually. 
In this chapter we begin a much more serious and systematic treatment 
of continued fractions. 


Finite Continued Fractions 

The notation we have been using thus far for continued fractions is 
certainly visually attractive as well as historically appropriate, but it is 
also rather inefficient and wasteful of space on the printed page. So, 
with some regret, we will now often make use of a much more compact 
notation, introduced by Dirichlet in 1854, with which we represent 
the finite continued fraction 

\[ q_1 + \cfrac{1}{q_2 + \cfrac{1}{q_3 + \cfrac{1}{\ddots + \cfrac{1}{q_{n-1} + \cfrac{1}{q_n}}}}} \]

by the expression $[q_1, q_2, q_3, \ldots, q_n]$. 

Thus we represent the finite continued fraction given above for $\frac{72}{30}$, 

\[ \frac{72}{30} = 2 + \cfrac{1}{2 + \cfrac{1}{2}}, \]

more compactly now by [2, 2, 2]. (Incidentally, another notation 
is sometimes used where, for example, the continued fraction 
[1, 2, 3, 4, 5] would be written as 
$1 + \frac{1}{2+}\,\frac{1}{3+}\,\frac{1}{4+}\,\frac{1}{5}$; however, we will 
not use this particular notation.) 

This notation may feel strange at first, but it is every bit as useful to 
write $\frac{72}{30} = [2, 2, 2]$ using this notation as it is to write 
$\frac{72}{30} = 2.4$ using the more familiar decimal notation. Note that by 
using this notation we are now only going to consider continued fractions 
where all of the numerators are equal to 1; moreover, in this chapter, we 
will almost always only consider continued fractions where each $q_i$ is an 
integer and $q_i$ is positive for each $i > 1$; however, $q_1$ may be positive, 
negative, or 0. These continued fractions are called simple continued 
fractions to distinguish them from more general continued fractions such as 
the one mentioned above discovered by Euler for the irrational number 
e. Nonetheless, throughout this chapter, we will often use the term 
continued fraction to indicate a simple continued fraction. On rare 
occasions where we relax the restriction that each $q_i$ be an integer, we 
will be clear about it. 

Since we are requiring that $q_1, \ldots, q_n$ be integers, it is quite obvious 
that the finite continued fraction $[q_1, q_2, q_3, \ldots, q_n]$ represents a 
rational number (see Problem 14.1). 

Conversely, because of the Euclidean algorithm, it is also true that 
any rational number can be expressed as a finite continued fraction. In 
order to illustrate this idea we recall from Chapter 3 the steps of the 
Euclidean algorithm that produced the gcd of the numbers 2001 and 
1984: 

\begin{align*}
2001 &= 1 \cdot 1984 + 17, \\
1984 &= 116 \cdot 17 + 12, \\
17 &= 1 \cdot 12 + 5, \\
12 &= 2 \cdot 5 + 2, \\
5 &= 2 \cdot 2 + 1, \\
2 &= 2 \cdot 1 + 0.
\end{align*}


In this case, the quotients produced in each line (using the division 
algorithm, Theorem 3.1) are, in order, $q_1 = 1$, $q_2 = 116$, $q_3 = 1$, 
$q_4 = 2$, $q_5 = 2$, and $q_6 = 2$. It follows that 
$\frac{2001}{1984} = [1, 116, 1, 2, 2, 2]$; in other words, 

\[ \frac{2001}{1984}
   = 1 + \cfrac{1}{116 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2}}}}}. \]

To see exactly why this works out so perfectly we rewrite each line, 
except the last, of the Euclidean algorithm as follows: 

\begin{align*}
\frac{2001}{1984} &= 1 + \frac{17}{1984}, \\
\frac{1984}{17} &= 116 + \frac{12}{17}, \\
\frac{17}{12} &= 1 + \frac{5}{12}, \\
\frac{12}{5} &= 2 + \frac{2}{5}, \\
\frac{5}{2} &= 2 + \frac{1}{2}.
\end{align*}

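Now, we begin at the top with $\frac{2001}{1984} = 1 + \frac{17}{1984}$ and 
replace each fraction as we come to it by its reciprocal in the next line. 
Here are the details: 

\begin{align*}
\frac{2001}{1984} &= 1 + \frac{17}{1984} = 1 + \cfrac{1}{\frac{1984}{17}} \\
 &= 1 + \cfrac{1}{116 + \frac{12}{17}}
  = 1 + \cfrac{1}{116 + \cfrac{1}{\frac{17}{12}}} \\
 &= 1 + \cfrac{1}{116 + \cfrac{1}{1 + \frac{5}{12}}}
  = 1 + \cfrac{1}{116 + \cfrac{1}{1 + \cfrac{1}{\frac{12}{5}}}} \\
 &= 1 + \cfrac{1}{116 + \cfrac{1}{1 + \cfrac{1}{2 + \frac{2}{5}}}}
  = 1 + \cfrac{1}{116 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{\frac{5}{2}}}}} \\
 &= 1 + \cfrac{1}{116 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2}}}}}.
\end{align*}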

Since any rational number can be handled in exactly this same way using 
the Euclidean algorithm (see Problem 14.2), we have the following 
fundamental result. 

Theorem 14.1. Any rational number can be expressed as a finite simple 
continued fraction, and conversely. 
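The procedure just described can be sketched in a few lines of Python. This is only an illustration: the function name `continued_fraction` is our own choice and is not part of the text, and we assume the denominator is a positive integer. Each pass through the loop is one line of the Euclidean algorithm, and the quotient produced there is the next partial quotient.

```python
def continued_fraction(numerator, denominator):
    """Return the partial quotients [q1, q2, ..., qn] of numerator/denominator.

    Hypothetical helper for illustration; assumes denominator > 0.
    """
    if denominator <= 0:
        raise ValueError("denominator must be a positive integer")
    quotients = []
    while denominator != 0:
        # One line of the Euclidean algorithm: numerator = q * denominator + r.
        q, r = divmod(numerator, denominator)
        quotients.append(q)
        # The next line divides the old divisor by the remainder.
        numerator, denominator = denominator, r
    return quotients


print(continued_fraction(2001, 1984))  # [1, 116, 1, 2, 2, 2]
print(continued_fraction(72, 30))      # [2, 2, 2]
```

Because Python's `divmod` rounds the quotient down, the first partial quotient may be negative or 0 while all later ones are positive, in line with the convention for simple continued fractions adopted above.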


It turns out that each rational number has two distinct representations 
as a continued fraction. For example, we have represented $\frac{72}{30}$ as 
[2, 2, 2], but it can also be represented as [2, 2, 1, 1]. This is not a serious 
difficulty (it can sometimes even be useful) and is no more problematic 
than the mildly disturbing fact that in the decimal system a number 
such as 1 has two different representations, namely as 1.000... and as 
0.999.... 

In our standard notation for continued fractions the individual numbers 
$q_1, q_2, q_3, \ldots, q_n$ are called the partial quotients, and the terms 

\[ c_1 = [q_1], \quad c_2 = [q_1, q_2], \quad c_3 = [q_1, q_2, q_3], \quad
   c_4 = [q_1, q_2, q_3, q_4], \quad \ldots \]

are called the convergents for the continued fraction. 

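Thus, for the continued fraction $\frac{2001}{1984} = [1, 116, 1, 2, 2, 2]$, we have 
the following convergents: 

\begin{align*}
c_1 &= [1] = \frac{1}{1} = 1, \\
c_2 &= [1, 116] = 1 + \frac{1}{116} = \frac{117}{116} \approx 1.008\,620, \\
c_3 &= [1, 116, 1] = 1 + \cfrac{1}{116 + \cfrac{1}{1}} = \frac{118}{117} \approx 1.008\,547, \\
c_4 &= [1, 116, 1, 2] = 1 + \cfrac{1}{116 + \cfrac{1}{1 + \cfrac{1}{2}}}
     = \frac{353}{350} \approx 1.008\,571, \\
c_5 &= [1, 116, 1, 2, 2] = 1 + \cfrac{1}{116 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{2}}}}
     = \frac{824}{817} \approx 1.008\,567, \\
c_6 &= [1, 116, 1, 2, 2, 2] = 1 + \cfrac{1}{116 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2}}}}}
     = \frac{2001}{1984} \approx 1.008\,568.
\end{align*}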

Of course, the last convergent $c_6$ is actually equal to the fraction 
$\frac{2001}{1984}$, and therefore the convergents do “converge” to this 
number; that is, they get closer and closer to the original fraction, and 
for this finite continued fraction they in fact actually reach this number. 

Note that the convergents $c_1, c_2, c_3, c_4, c_5$ are alternately less than or 
greater than the fraction $\frac{2001}{1984}$ and, in particular, we have 

\[ c_1 < c_3 < c_5 < \frac{2001}{1984} = c_6 < c_4 < c_2. \]

In other words, as the convergents converge to this fraction they do so 
in a very peculiar way: the odd convergents converge from below while 
the even convergents converge from above! 

Computing the values of a sequence of convergents such as 

\[ c_1 = \frac{1}{1}, \quad c_2 = \frac{117}{116}, \quad c_3 = \frac{118}{117}, \quad
   c_4 = \frac{353}{350}, \quad c_5 = \frac{824}{817}, \quad c_6 = \frac{2001}{1984} \]

as we did in this example is certainly tedious, to say the least. It turns 
out that there is a much easier way to compute convergents, recursively, 
without having to actually compute each one individually from 
scratch. 

In this example, look at the convergent $c_3 = \frac{118}{117}$. Note that the 
numerator 118 is the sum of the numerators of the two previous convergents 
(that is, 118 = 117 + 1); and that the denominator 117 is 
the sum of the denominators of the two previous convergents (that is, 
117 = 116 + 1). 

What about finding the next convergent $c_4 = \frac{353}{350}$? Its numerator 
353 can be found by multiplying the numerator of $c_3$ by 2 (we multiply 
by 2 because $q_4 = 2$) and adding the numerator of $c_2$, that is, 
353 = 2 · 118 + 117; and its denominator 350 can also be found in the 
same way by multiplying the denominator of $c_3$ by 2 and adding the 
denominator of $c_2$, that is, 350 = 2 · 117 + 116. 

We can compute $c_5$ in exactly the same way (again, we multiply 
by 2 here because $q_5 = 2$). Its numerator is 2 · 353 + 118 = 824, 
and its denominator is 2 · 350 + 117 = 817. Then, for $c_6$ we compute 
2 · 824 + 353 = 2001 for the numerator and 2 · 817 + 350 = 1984 for the 
denominator. 

So, the pattern in each of these cases seems to be that in order 
to compute the numerator (or denominator) of a convergent $c_k$ for a 
continued fraction $[q_1, q_2, \ldots, q_n]$ you simply multiply the numerator 
(or denominator) of convergent $c_{k-1}$ by $q_k$ and add the numerator (or 
denominator) of $c_{k-2}$. This leads us to our next theorem. 

We will use induction to prove this theorem. Since one step in the 
proof is slightly subtle it will be a good idea to illustrate this particular 
step before actually using it in the proof itself. So, as an example, 
consider the convergent $c_4 = \frac{353}{350}$ computed above. 

We have already observed that 
$c_4 = \frac{2 \cdot 118 + 117}{2 \cdot 117 + 116}$. But the convergent $c_4$ 
represents the continued fraction 

\[ 1 + \cfrac{1}{116 + \cfrac{1}{1 + \cfrac{1}{2}}}, \]

which can be turned into the continued fraction represented by the 
convergent $c_5$ by replacing the 2 in the denominator of the last fraction 
by the term $2 + \frac{1}{2}$: 

\[ 1 + \cfrac{1}{116 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{2}}}}. \]

Therefore, we can also find $c_5$ by replacing each 2 in the expression 
$c_4 = \frac{2 \cdot 118 + 117}{2 \cdot 117 + 116}$ by $2 + \frac{1}{2}$ to get 

\[ c_5 = \frac{(2 + \frac{1}{2}) \cdot 118 + 117}{(2 + \frac{1}{2}) \cdot 117 + 116}
       = \frac{5 \cdot 118 + 2 \cdot 117}{5 \cdot 117 + 2 \cdot 116}
       = \frac{824}{817}, \]

exactly as we expect! 

The reason this works (and the only reason this step is slightly subtle) 
is that in both cases where we replace 2 by something, the 2 is really 
the partial quotient $q_4$ and we are replacing all occurrences of $q_4$, and 
leaving everything else unchanged because each of the other numbers 
depends only upon previous partial quotients $q_1, q_2, q_3$. 

Theorem 14.2. Let $[q_1, q_2, \ldots, q_n]$ be a simple continued fraction with 
convergents $c_1, c_2, \ldots, c_n$. Further, for each convergent $c_k$, write 
$c_k = \frac{a_k}{b_k}$ where $a_k$ is the numerator and $b_k$ is the denominator 
of the convergent (and, by definition, we set $b_1 = 1$). 

Then, for each $k \ge 3$, 

\[ a_k = q_k \cdot a_{k-1} + a_{k-2} \quad \text{and} \quad b_k = q_k \cdot b_{k-1} + b_{k-2}. \]


Proof 

Note that $c_1 = q_1 = \frac{q_1}{1}$, so $a_1 = q_1$ and $b_1 = 1$; and also note that 
$c_2 = q_1 + \frac{1}{q_2} = \frac{q_1 q_2 + 1}{q_2}$, so $a_2 = q_1 q_2 + 1$ and $b_2 = q_2$. 

We will use induction. First we prove the statement for the value 
k = 3: 

\begin{align*}
c_3 &= q_1 + \cfrac{1}{q_2 + \cfrac{1}{q_3}} = q_1 + \frac{q_3}{q_2 q_3 + 1}
     = \frac{q_1 q_2 q_3 + q_1 + q_3}{q_2 q_3 + 1} \\
    &= \frac{q_3 (q_1 q_2 + 1) + q_1}{q_3 (q_2) + 1}
     = \frac{q_3 \cdot a_2 + a_1}{q_3 \cdot b_2 + b_1};
\end{align*}

that is, $a_3 = q_3 \cdot a_2 + a_1$ and $b_3 = q_3 \cdot b_2 + b_1$, as desired, and the 
statement is true for k = 3. 

Now assume that the statement is true for some $k \ge 3$. Thus 

\[ c_k = \frac{q_k \cdot a_{k-1} + a_{k-2}}{q_k \cdot b_{k-1} + b_{k-2}} \]

and, since the convergent $c_{k+1}$ can be found by replacing $q_k$ in the 
continued fraction $[q_1, q_2, \ldots, q_k]$ by $q_k + \frac{1}{q_{k+1}}$, we get 

\begin{align*}
c_{k+1} &= \frac{(q_k + \frac{1}{q_{k+1}}) \cdot a_{k-1} + a_{k-2}}{(q_k + \frac{1}{q_{k+1}}) \cdot b_{k-1} + b_{k-2}}
         = \frac{(q_{k+1} q_k + 1) \cdot a_{k-1} + q_{k+1} a_{k-2}}{(q_{k+1} q_k + 1) \cdot b_{k-1} + q_{k+1} b_{k-2}} \\
        &= \frac{q_{k+1} (q_k \cdot a_{k-1} + a_{k-2}) + a_{k-1}}{q_{k+1} (q_k \cdot b_{k-1} + b_{k-2}) + b_{k-1}}
         = \frac{q_{k+1} (a_k) + a_{k-1}}{q_{k+1} (b_k) + b_{k-1}};
\end{align*}

hence $a_{k+1} = q_{k+1} \cdot a_k + a_{k-1}$ and $b_{k+1} = q_{k+1} \cdot b_k + b_{k-1}$, as desired, and 
the statement is true for all $k \ge 3$. 

This completes the proof of the theorem. ■ 

Theorem 14.2 tells us that in order to compute convergents we can 
use recurrence beginning with k = 3. However, if we define $a_0 = 1$, 
$b_0 = 0$ and $a_{-1} = 0$, $b_{-1} = 1$, then recurrence also works for k = 2 and 
k = 1 since $q_2 a_1 + a_0 = q_2 q_1 + 1 = a_2$ and $q_2 b_1 + b_0 = q_2 = b_2$, and 
also $q_1 a_0 + a_{-1} = q_1 = a_1$, and $q_1 b_0 + b_{-1} = 1 = b_1$. This convention 
is especially convenient if you compute convergents using a table, as 
follows: 

k      -1    0    1     2     3    ...   k 

q_k               q_1   q_2   q_3  ...   q_k 

a_k     0    1    a_1   a_2   a_3  ...   a_k 

b_k     1    0    b_1   b_2   b_3  ...   b_k 


In this format it is relatively easy to compute the convergents 
$c_k = \frac{a_k}{b_k}$ one at a time. We begin by finding $a_1$, which is 
$q_1 \cdot 1 + 0 = q_1$; similarly, $b_1$ is $q_1 \cdot 0 + 1 = 1$. Then, moving to the 
right one column in the table, we find $a_2$, which is $q_2 q_1 + 1$; and we find 
$b_2$, which is $q_2 \cdot 1 + 0 = q_2$. We continue in this way one column at a 
time. We illustrate this process in the table below, where we are computing 
the convergents for the continued fraction [1, 116, 1, 2, 2, 2]: 

k      -1    0    1    2     3     4     5    6 

q_k               1    116   1     (2)   2    2 

a_k     0    1    1    117   118   ? 

b_k     1    0    1    116   117 

Here we are about to compute $a_4$ by multiplying the marked 2 (that is, $q_4$) 
by 118, the most recent entry in the $a_k$ row, and adding 117, the entry 
before it, to get 2 · 118 + 117 = 353. Then we would compute $b_4$ by 
multiplying this same 2 by 117 and adding 116 to get 2 · 117 + 116 = 350. 
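The table method translates almost directly into code. The following Python sketch is offered only as an illustration (the function name `convergents` is our own, not the text's); it starts from the conventional values $a_{-1} = 0$, $a_0 = 1$, $b_{-1} = 1$, $b_0 = 0$ and fills in one column of the table per partial quotient.

```python
def convergents(quotients):
    """Return [(a_1, b_1), ..., (a_n, b_n)] for the fraction [q_1, ..., q_n].

    Hypothetical helper for illustration. Uses the recurrence of
    Theorem 14.2 with a_{-1} = 0, a_0 = 1, b_{-1} = 1, b_0 = 0.
    """
    a_prev, a_curr = 0, 1  # a_{-1}, a_0
    b_prev, b_curr = 1, 0  # b_{-1}, b_0
    table = []
    for q in quotients:
        # Move one column to the right: new = q * current + previous.
        a_prev, a_curr = a_curr, q * a_curr + a_prev
        b_prev, b_curr = b_curr, q * b_curr + b_prev
        table.append((a_curr, b_curr))
    return table


for k, (a, b) in enumerate(convergents([1, 116, 1, 2, 2, 2]), start=1):
    print(f"c_{k} = {a}/{b}")
```

Run on [1, 116, 1, 2, 2, 2], this reproduces the six convergents computed by hand above, ending with 2001/1984.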

In Chapter 4, we noticed an interesting property of convergents, 
namely that if $\frac{a_{k-1}}{b_{k-1}}$ and $\frac{a_k}{b_k}$ are two consecutive 
convergents for a continued fraction, then the cross-product of their 
numerators and denominators is given by 

\[ a_k b_{k-1} - b_k a_{k-1} = (-1)^k. \]

For example, the convergents for the continued fraction for $\frac{2001}{1984}$ are 

\[ \frac{1}{1}, \quad \frac{117}{116}, \quad \frac{118}{117}, \quad \frac{353}{350}, \quad
   \frac{824}{817}, \quad \frac{2001}{1984}, \]

and for k = 4 we have 

\[ a_4 b_3 - b_4 a_3 = 353 \cdot 117 - 350 \cdot 118 = 41\,301 - 41\,300 = 1 = (-1)^4; \]

while for k = 5 we have 

\[ a_5 b_4 - b_5 a_4 = 824 \cdot 350 - 817 \cdot 353 = 288\,400 - 288\,401 = -1 = (-1)^5. \]

This remarkable property is an easy consequence of Theorem 14.2. 

Theorem 14.3. Let $[q_1, q_2, \ldots, q_n]$ be a simple continued fraction 
with convergents $c_1, c_2, \ldots, c_n$. Further, for each convergent $c_k$, write 
$c_k = \frac{a_k}{b_k}$ where $a_k$ is the numerator and $b_k$ is the denominator of 
the convergent (and, by definition, we set $b_1 = 1$). 

Then, for each $k > 1$, 

\[ a_k b_{k-1} - b_k a_{k-1} = (-1)^k. \]


Proof 

We use a proof by induction. Recall that $a_1 = q_1$, $b_1 = 1$, $a_2 = q_1 q_2 + 1$, 
and $b_2 = q_2$. 

Therefore, for k = 2, we have 
$a_2 b_1 - b_2 a_1 = (q_1 q_2 + 1) \cdot 1 - q_2 q_1 = 1 = (-1)^2$, and the 
statement is true for k = 2. 

Now assume that the statement is true for some k where $2 \le k < n$, 
that is, assume that 

\[ a_k b_{k-1} - b_k a_{k-1} = (-1)^k. \]

Then, by Theorem 14.2, 

\begin{align*}
a_{k+1} b_k - b_{k+1} a_k &= (q_{k+1} a_k + a_{k-1}) b_k - (q_{k+1} b_k + b_{k-1}) a_k \\
 &= a_{k-1} b_k - b_{k-1} a_k = -(a_k b_{k-1} - b_k a_{k-1}) = -(-1)^k = (-1)^{k+1}.
\end{align*}

This completes the proof of the theorem. ■ 

Alternate Proof of Theorem 14.3 

Recall that, for convenience, we set $a_0 = 1$, $b_0 = 0$ and $a_{-1} = 0$, $b_{-1} = 1$. 

Let $1 \le k \le n$. Then, by Theorem 14.2, 

\begin{align*}
a_k b_{k-1} - b_k a_{k-1} &= (q_k a_{k-1} + a_{k-2}) b_{k-1} - (q_k b_{k-1} + b_{k-2}) a_{k-1} \\
 &= a_{k-2} b_{k-1} - b_{k-2} a_{k-1} = -(a_{k-1} b_{k-2} - b_{k-1} a_{k-2}).
\end{align*}

Since this formula holds for any k, we can now also write 

\[ a_{k-1} b_{k-2} - b_{k-1} a_{k-2} = -(a_{k-2} b_{k-3} - b_{k-2} a_{k-3}), \]

and so 

\[ a_k b_{k-1} - b_k a_{k-1} = (-1)^2 (a_{k-2} b_{k-3} - b_{k-2} a_{k-3}). \]

Continuing in this way, we get 

\begin{align*}
a_k b_{k-1} - b_k a_{k-1} &= (-1)^1 (a_{k-1} b_{k-2} - b_{k-1} a_{k-2}) \\
 &= (-1)^2 (a_{k-2} b_{k-3} - b_{k-2} a_{k-3}) \\
 &\;\;\vdots \\
 &= (-1)^k (a_0 b_{-1} - b_0 a_{-1}) \\
 &= (-1)^k (1 \cdot 1 - 0 \cdot 0) \\
 &= (-1)^k.
\end{align*}

This completes the alternate proof of Theorem 14.3. ■ 
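A quick numerical check of Theorem 14.3 can be sketched as follows. This is our own illustration, not part of the proof: the function name `cross_products` is hypothetical, and the code relies on the conventions $a_0 = 1$, $b_0 = 0$, $a_{-1} = 0$, $b_{-1} = 1$ used in the alternate proof, so that the identity can be checked for k = 1 as well.

```python
def cross_products(quotients):
    """Yield (k, a_k * b_{k-1} - b_k * a_{k-1}) for k = 1, ..., n.

    Hypothetical helper for illustration; a_k and b_k come from the
    recurrence of Theorem 14.2 with a_0 = 1, b_0 = 0, a_{-1} = 0, b_{-1} = 1.
    """
    a_prev, a_curr = 0, 1  # a_{-1}, a_0
    b_prev, b_curr = 1, 0  # b_{-1}, b_0
    for k, q in enumerate(quotients, start=1):
        a_prev, a_curr = a_curr, q * a_curr + a_prev
        b_prev, b_curr = b_curr, q * b_curr + b_prev
        yield k, a_curr * b_prev - b_curr * a_prev


for k, value in cross_products([1, 116, 1, 2, 2, 2]):
    print(k, value, value == (-1) ** k)
```

Each printed line should show the cross-product alternating between -1 and 1, matching $(-1)^k$.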

You may have noticed that convergents seem to always be in lowest 
terms. In other words, whenever we have computed a convergent 
$c_k = \frac{a_k}{b_k}$ it has turned out that $a_k$ and $b_k$ are relatively prime. 
Note that this is true for $a_1$ and $b_1$ since $a_1 = q_1$ and $b_1 = 1$, and for 
$k > 1$ this is true by Theorem 14.3, since if d is the greatest common divisor 
of $a_k$ and $b_k$, then d also divides $(-1)^k$, and so d = 1. 

The fact that $a_k$ and $b_k$ are relatively prime was very useful in 
Chapter 4 where we saw how to use the formula in Theorem 14.3 to 
solve linear Diophantine equations, that is, equations of the form 
$ax + by = c$. Recall from Theorem 4.1 that if you can find one solution 
$x = x_1$, $y = y_1$ for this equation, you can immediately write down all 
solutions. Furthermore, if you can find a solution $x = x_1$, $y = y_1$ for the 
equation $ax + by = 1$, then $x = c x_1$, $y = c y_1$ is a solution for the equation 
$ax + by = c$. Therefore, to solve linear Diophantine equations in general, 
it is sufficient to focus on the equation $ax + by = 1$. We also know from 
Chapter 4 that in order for the equation $ax + by = 1$ to have a solution 
it must be the case that a and b are relatively prime. 

Let’s review how to solve linear Diophantine equations by finding 
a solution to the equation $31x + 24y = 1$. Note that 31 and 24 are 
relatively prime. First, we represent the fraction $\frac{31}{24}$ as a continued 
fraction: [1, 3, 2, 3]. Next, we use Theorem 14.2 to find the convergents: 
$\frac{1}{1}, \frac{4}{3}, \frac{9}{7}, \frac{31}{24}$. Now, applying Theorem 14.3 to the 
last two convergents we see that 

\[ 31 \cdot 7 - 24 \cdot 9 = (-1)^4 = 1. \]

Therefore, we can immediately write down the solution x = 7, y = -9 
for the equation $31x + 24y = 1$. In Problem 14.11 you will be asked 
to use Theorem 14.3 to find a solution for the Diophantine equation 
900x + 628y = 400. 
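The three steps used for $31x + 24y = 1$ (expand a/b as a continued fraction, compute the convergents, read the solution off the last two) can be sketched in Python. This is only an illustration under stated assumptions: the function name `solve_unit_equation` is our own, and a and b are assumed to be positive and relatively prime. If the fraction has n partial quotients, Theorem 14.3 gives $a \cdot b_{n-1} - b \cdot a_{n-1} = (-1)^n$, so the signs are adjusted according to the parity of n.

```python
def solve_unit_equation(a, b):
    """Return integers (x, y) with a*x + b*y == 1.

    Hypothetical helper for illustration; assumes a > 0 and b > 0 are
    relatively prime.
    """
    # Step 1: partial quotients of a/b via the Euclidean algorithm.
    quotients = []
    num, den = a, b
    while den != 0:
        q, r = divmod(num, den)
        quotients.append(q)
        num, den = den, r
    if num != 1:
        raise ValueError("a and b must be relatively prime")

    # Step 2: convergents via Theorem 14.2; keep only the last two.
    a_prev, a_curr = 0, 1  # a_{-1}, a_0
    b_prev, b_curr = 1, 0  # b_{-1}, b_0
    for q in quotients:
        a_prev, a_curr = a_curr, q * a_curr + a_prev
        b_prev, b_curr = b_curr, q * b_curr + b_prev

    # Step 3: Theorem 14.3 says a*b_{n-1} - b*a_{n-1} = (-1)^n.
    sign = (-1) ** len(quotients)
    return sign * b_prev, -sign * a_prev


x, y = solve_unit_equation(31, 24)
print(x, y, 31 * x + 24 * y)  # 7 -9 1
```

Multiplying both x and y by c then gives a solution of $ax + by = c$, as recalled above.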

We turn now to another interesting property about convergents 
that we have previously observed: if a number x is represented by a 
continued fraction, then the odd-numbered convergents “converge” 
to x from below and the even-numbered convergents “converge” to x 
from above. In particular, we observed that this property holds for the 
continued fraction $\frac{2001}{1984} = [1, 116, 1, 2, 2, 2]$ since 
$c_1 < c_3 < c_5 < \frac{2001}{1984} = c_6 < c_4 < c_2$. 


Theorem 14.4. Let $x = [q_1, q_2, \ldots, q_n]$ be a simple continued fraction 
with convergents $c_1, c_2, \ldots, c_n$. Then, the odd-numbered convergents form 
a strictly increasing sequence, and the even-numbered convergents form a 
strictly decreasing sequence; moreover, each odd-numbered convergent is less 
than every even-numbered convergent. In other words, 

\[ c_1 < c_3 < c_5 < \cdots < c_6 < c_4 < c_2. \]

In particular, if n is odd, then $x = c_n$ is less than every even-numbered 
convergent; whereas, if n is even, then $x = c_n$ is greater than every 
odd-numbered convergent. 

Proof 

As usual, for each convergent $c_k$, write $c_k = \frac{a_k}{b_k}$ where $a_k$ is the 
numerator and $b_k$ is the denominator of the convergent. Recall also that 
$a_1 = q_1$, $b_1 = 1$, $a_2 = q_1 q_2 + 1$, and $b_2 = q_2$. Since $b_1 > 0$ and $b_2 > 0$, it 
follows from Theorem 14.2 that $b_k > 0$ for all $k \ge 1$. 

First, we observe that the convergents alternately increase and decrease. 
By Theorem 14.3, we have $a_k b_{k-1} - b_k a_{k-1} = (-1)^k$ for any $k > 1$. 
Dividing by $b_k b_{k-1}$, we get 

\[ \frac{a_k}{b_k} - \frac{a_{k-1}}{b_{k-1}} = \frac{(-1)^k}{b_k b_{k-1}}. \]

Therefore, since $b_k b_{k-1} > 0$, we conclude that when k is even, $c_k > c_{k-1}$; 
and when k is odd, $c_k < c_{k-1}$. 

Next, we show that the odd-numbered convergents are strictly increasing, 
and the even-numbered convergents are strictly decreasing. 
Using Theorems 14.2 and 14.3, we can write, for $k \ge 3$, 

\begin{align*}
a_k b_{k-2} - b_k a_{k-2} &= (q_k \cdot a_{k-1} + a_{k-2}) b_{k-2} - (q_k \cdot b_{k-1} + b_{k-2}) a_{k-2} \\
 &= q_k (a_{k-1} b_{k-2} - b_{k-1} a_{k-2}) = q_k (-1)^{k-1}.
\end{align*}

Dividing by $b_k b_{k-2}$, we get 

\[ \frac{a_k}{b_k} - \frac{a_{k-2}}{b_{k-2}} = \frac{q_k (-1)^{k-1}}{b_k b_{k-2}}. \]

Therefore, since $\frac{q_k}{b_k b_{k-2}} > 0$, we conclude that when k is odd, 
$c_k > c_{k-2}$; and when k is even, $c_k < c_{k-2}$. Thus the odd-numbered 
convergents are strictly increasing, and the even-numbered convergents are 
strictly decreasing. 

Finally, suppose, by way of contradiction, that there is an odd-numbered 
convergent $c_{2i+1}$ greater than some even-numbered convergent $c_{2j}$. 
We observed above the way in which the convergents 
alternately increase and decrease. So $c_{2i+1} < c_{2i}$, which means that 
$c_{2j} < c_{2i}$. But the even-numbered convergents are strictly decreasing, 
and so $i < j$. Similarly, $c_{2j} > c_{2j-1}$, which means that $c_{2i+1} > c_{2j-1}$. 
But the odd-numbered convergents are strictly increasing, and so $i \ge j$, 
a contradiction. Therefore, all the odd-numbered convergents are less 
than all the even-numbered convergents. 

This completes the proof of the theorem. ■ 

We end this section on finite continued fractions with a simple 
observation about convergents that will be extremely important in 
the next section when we turn our attention to infinite continued 
fractions. You have surely noticed when computing the convergents 
$c_1 = \frac{a_1}{b_1}, c_2 = \frac{a_2}{b_2}, \ldots, c_n = \frac{a_n}{b_n}$ for a finite 
simple continued fraction that the numerators $a_k$ and the denominators $b_k$ 
get large rather quickly as k increases. We focus on the denominators in the 
following theorem. We have already observed that $b_k > 0$ for all $k \ge 1$, and 
so, by Theorem 14.2, $b_k > b_{k-1}$ for all $k \ge 3$; but we can say a good bit 
more about how fast these denominators increase. 

Theorem 14.5. Let $[q_1, q_2, \ldots, q_n]$ be a simple continued fraction with 
convergents $c_1, c_2, \ldots, c_n$. Further, for each convergent $c_k$, write 
$c_k = \frac{a_k}{b_k}$ where $a_k$ is the numerator and $b_k$ is the denominator of 
the convergent and, by definition, $b_1 = 1$ and $b_2 = q_2$. Also, let $F_k$ be the 
kth Fibonacci number for each $k \ge 1$. Then, for each $k \ge 1$, $b_k \ge F_k$. 

Proof 

We use a proof by induction. Actually, we will use the form of induction 
called strong induction, which was introduced in the proof of 
Theorem 2.2. Since $b_1 = 1 = F_1$, we have $b_1 \ge F_1$, as claimed. Also, 
$b_2 = q_2 > 0$, so $b_2 \ge 1 = F_2$, and $b_2 \ge F_2$, as claimed. Thus the statement 
holds for k = 1 and k = 2. 

Now assume that for some $3 \le k \le n$ the statement is true for all 
$1 \le i < k$; in particular, then, we are assuming that the statement is true 
for k - 1 and for k - 2, that is, $b_{k-1} \ge F_{k-1}$ and $b_{k-2} \ge F_{k-2}$. Then, since 
$q_k \ge 1$, we have, by Theorem 14.2, 

\[ b_k = q_k \cdot b_{k-1} + b_{k-2} \ge b_{k-1} + b_{k-2} \ge F_{k-1} + F_{k-2} = F_k, \]

and the statement is also true for k. 

This completes the proof of the theorem. ■ 


Infinite Continued Fractions 

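We saw in the previous section that finite simple continued fractions 
represent rational numbers; it seems as though infinite simple continued 
fractions such as 

\[ 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}}
   \qquad \text{and} \qquad
   1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{2 + \cdots}}}} \]

represent irrational numbers such as $\frac{1+\sqrt{5}}{2}$ and $\sqrt{3}$. 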

This is in fact true. Moreover, the main reason we are interested 
in infinite continued fractions is that they can be used to represent 
irrational numbers and hence, as a by-product, their convergents can 
be used to provide us with good rational approximations to irrational 
numbers. For example, in Problem 3.33 we found an excellent rational 
approximation for $\pi$, and in Problems 4.14 and 4.15 we found rational 
approximations for $\sqrt{5}$ and $\sqrt{7}$ that are correct to four decimal places. 

Before we investigate infinite continued fractions in any detail, let’s 
review how we approximated $\sqrt{5}$ using continued fractions. The idea, 
of course, is to represent $\sqrt{5}$ by an infinite simple continued fraction 
$[q_1, q_2, q_3, \ldots]$, that is, by 

\[ q_1 + \cfrac{1}{q_2 + \cfrac{1}{q_3 + \cfrac{1}{q_4 + \cfrac{1}{q_5 + \cdots}}}} \]

where the partial quotients $q_1, q_2, q_3, \ldots$ in this case are all positive 
integers. 

We begin by finding the first partial quotient: $q_1 = \lfloor \sqrt{5} \rfloor = 2$. Thus 
$\frac{1}{[q_2, q_3, \ldots]} = \sqrt{5} - 2$, which means that 
$[q_2, q_3, \ldots] = \frac{1}{\sqrt{5} - 2} = \sqrt{5} + 2$. So we 
can find the next partial quotient: $q_2 = \lfloor \sqrt{5} + 2 \rfloor = 4$. 

We get very lucky at the next step because we now have 
$\sqrt{5} + 2 = 4 + \frac{1}{[q_3, q_4, \ldots]}$; this can be rewritten as 
$[q_3, q_4, \ldots] = \frac{1}{\sqrt{5} - 2} = \sqrt{5} + 2$, and we 
immediately notice this is the same expression as in the last step; hence 
$q_3$ also equals 4 and we realize that all partial quotients from now on 
will equal 4 because each step will be exactly the same. Therefore, we 
conclude that $\sqrt{5}$ is represented by the infinite continued fraction 

\[ 2 + \cfrac{1}{4 + \cfrac{1}{4 + \cfrac{1}{4 + \cdots}}}. \]

Next, we compute the convergents for this continued fraction in a 
table using Theorem 14.2: 

k     -1   0   1   2   3    4     5    ... 

q_k            2   4   4    4     4    ... 

a_k    0   1   2   9   38   161   682  ... 

b_k    1   0   1   4   17   72    305  ... 

Finally, we see that the fourth convergent $c_4 = \frac{161}{72} = 2.236\,111\ldots$ 
yields an approximation to $\sqrt{5}$ that is correct to four decimal places. 
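The floor-and-reciprocal steps above can be carried out with integers only. The sketch below is not the text's method verbatim: it uses a standard integer formulation in which the quantity being expanded at each step is kept in the form $(\sqrt{n} + m)/d$, so that, for example, the value $\sqrt{5} + 2$ met above corresponds to m = 2, d = 1. The function name `sqrt_partial_quotients` and the variable names m and d are our own choices.

```python
from math import isqrt


def sqrt_partial_quotients(n, count):
    """Return the first `count` partial quotients of sqrt(n).

    Hypothetical helper for illustration; n must be a positive integer
    that is not a perfect square. The current value being expanded is
    always (sqrt(n) + m) / d with integers m and d.
    """
    root = isqrt(n)  # floor of sqrt(n)
    if root * root == n:
        raise ValueError("n must not be a perfect square")
    m, d, q = 0, 1, root
    quotients = [q]
    while len(quotients) < count:
        # Subtract q and take the reciprocal, rationalizing the denominator.
        m = d * q - m
        d = (n - m * m) // d
        q = (root + m) // d  # floor of (sqrt(n) + m) / d
        quotients.append(q)
    return quotients[:count]


print(sqrt_partial_quotients(5, 5))  # [2, 4, 4, 4, 4]
print(sqrt_partial_quotients(7, 6))  # [2, 1, 1, 1, 4, 1]
```

Avoiding floating-point square roots means the partial quotients stay exact no matter how far the expansion is carried.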
This example raises a couple of questions. The first is: what does it 
mean to say an infinite continued fraction represents a given number? 
In other words, does it make any sense to write expressions such as 
$\sqrt{5} = [2, 4, 4, 4, \ldots]$? After all, the continued fraction [2, 4, 4, 4, ...] 
is by definition infinite, and we can never actually carry out the infinite 
number of additions and divisions involved. Even if we invoke Theorem 14.2 
and extend the above table farther and farther to the right, we 
never reach an end. 

Another question arises concerning using an infinite continued fraction, 
as we did above, to approximate a given irrational number. How 
do we know that a particular convergent is correct, say, to four decimal 
places? In the example above, we used $c_4 = \frac{161}{72}$ to approximate 
$\sqrt{5}$ this accurately, but did so only because $c_4$ agrees with the answer 
given by a calculator through four decimal places. 

In order to deal adequately with such questions, we need to clarify 
a few things. So, first of all, an infinite simple continued fraction is an 
expression of the form 

\[ q_1 + \cfrac{1}{q_2 + \cfrac{1}{q_3 + \cfrac{1}{q_4 + \cfrac{1}{q_5 + \cdots}}}} \]

where $q_1, q_2, q_3, \ldots$ is an infinite sequence of integers, called the 
partial quotients, such that $q_i$ is positive for each $i > 1$; $q_1$, however, may 
be positive, negative, or 0. 

We will again adopt a more compact notation and write infinite 
continued fractions as $[q_1, q_2, q_3, \ldots]$. We define the convergents for an 
infinite continued fraction to be the following finite continued fractions: 

\[ c_1 = [q_1], \quad c_2 = [q_1, q_2], \quad c_3 = [q_1, q_2, q_3], \quad \ldots \]

and note that, by Theorem 14.1, the convergents form an infinite 
sequence of rational numbers. 

An extremely important point to observe is that because the con- 
vergents of an infinite simple continued fraction are themselves finite 
simple continued fractions we can apply all of the theorems in the 
previous section to these convergents. 

Next, we say that an infinite simple continued fraction 
$[q_1, q_2, q_3, \ldots]$ represents the real number x (or, equivalently, we write 
$x = [q_1, q_2, q_3, \ldots]$) if the sequence $c_1, c_2, c_3, \ldots$ converges to x; that 
is, if $\lim_{k \to \infty} c_k = x$. 

So, with these definitions in hand, our first order of business is to 
prove that any infinite simple continued fraction represents a real number. 
In other words, we need to show that the sequence of convergents 
$c_1, c_2, c_3, \ldots$ always has a limit. This is surprisingly easy, but it requires 
some care. 

The key is to use Theorem 14.4, which guarantees that the sequence 
$c_1, c_2, c_3, \ldots$, although infinite, still satisfies 

\[ c_1 < c_3 < c_5 < \cdots < c_6 < c_4 < c_2. \]

Consider the increasing sequence $c_1 < c_3 < c_5 < \cdots$ consisting of 
the odd convergents. It is clear that this sequence cannot diverge to 
infinity because each number in this sequence is less than $c_2$ (in this 
situation, we say that the sequence is bounded above by $c_2$). Therefore, 
this sequence must converge to a limit. Let $x_1 = \lim_{k \to \infty} c_{2k-1}$ be that 
limit. Also, observe that since each odd convergent is less than every 
even convergent, it follows that $x_1 < c_{2k}$ for each k. 

Similarly, the decreasing sequence $c_2 > c_4 > c_6 > \cdots$ of the even 
convergents is bounded below by $c_1$ and therefore also converges to a 
limit. Let $x_2 = \lim_{k \to \infty} c_{2k}$ be that limit, and note that $x_2 > c_{2k-1}$ for 
each k. 

At this point it is clear that $x_1 \le x_2$, but we still need to show that 
$x_1 = x_2$. Well, for any k, we have 

\[ c_{2k-1} < x_1 \le x_2 < c_{2k}, \]

and so, by Theorem 14.3, 

\[ x_2 - x_1 < c_{2k} - c_{2k-1}
   = \frac{a_{2k}}{b_{2k}} - \frac{a_{2k-1}}{b_{2k-1}}
   = \frac{a_{2k} b_{2k-1} - b_{2k} a_{2k-1}}{b_{2k} b_{2k-1}}
   = \frac{(-1)^{2k}}{b_{2k} b_{2k-1}}
   = \frac{1}{b_{2k} b_{2k-1}}. \]

Thus, for any k, we have $x_2 - x_1 < \frac{1}{b_{2k} b_{2k-1}}$. But, by Theorem 14.5, 
$\lim_{k \to \infty} \frac{1}{b_{2k} b_{2k-1}} = 0$. Therefore, $x_2 = x_1$, and we conclude that the 
sequence of convergents $c_1, c_2, c_3, \ldots$ has a limit, namely, $x = x_2 = x_1$. 
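The squeeze in this argument also answers the earlier question about accuracy without appealing to a calculator. As a sketch (the helper `convergents` below is our own illustrative function, and we apply the argument to the consecutive pair $c_4$, $c_5$ rather than to $c_{2k-1}$, $c_{2k}$): the limit x of [2, 4, 4, 4, ...] lies between the odd convergent $c_5$ and the even convergent $c_4$, so the error of $c_4$ is smaller than $c_4 - c_5 = \frac{1}{b_4 b_5}$.

```python
from fractions import Fraction


def convergents(quotients):
    """Return the convergents c_1, ..., c_n of [q_1, ..., q_n] as Fractions.

    Hypothetical helper for illustration, using the recurrence of
    Theorem 14.2 with a_{-1} = 0, a_0 = 1, b_{-1} = 1, b_0 = 0.
    """
    a_prev, a_curr = 0, 1
    b_prev, b_curr = 1, 0
    result = []
    for q in quotients:
        a_prev, a_curr = a_curr, q * a_curr + a_prev
        b_prev, b_curr = b_curr, q * b_curr + b_prev
        result.append(Fraction(a_curr, b_curr))
    return result


c = convergents([2, 4, 4, 4, 4])  # c[0] is c_1, ..., c[4] is c_5
c4, c5 = c[3], c[4]
gap = c4 - c5                     # equals 1 / (b_4 * b_5)
print(c5, c4, gap)                # 682/305 161/72 1/21960
print(c5 ** 2 < 5 < c4 ** 2)      # True: sqrt(5) lies between c_5 and c_4
```

Since 1/21960 is less than 0.00005, the convergent 161/72 differs from the limit by less than half a unit in the fourth decimal place.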

So, at this stage, we know that any infinite simple continued fraction 
converges to a real number x. But we can now also conclude that 
x is an irrational number. For, by way of contradiction, assume that 
$[q_1, q_2, q_3, \ldots]$ is an infinite simple continued fraction with convergents 
$c_1, c_2, c_3, \ldots$ that converges to x, and that $x = \frac{a}{b}$ is a rational 
number (where we may take $b > 0$). 

Then, as we saw above, 

\[ 0 < c_{2k} - x < c_{2k} - c_{2k-1} = \frac{1}{b_{2k} b_{2k-1}}. \]

Therefore, 

\[ 0 < \frac{a_{2k}}{b_{2k}} - \frac{a}{b} < \frac{1}{b_{2k} b_{2k-1}}, \]

and so 

\[ 0 < \frac{b a_{2k} - a b_{2k}}{b_{2k} b} < \frac{1}{b_{2k} b_{2k-1}}, \]

which means that 

\[ 0 < b a_{2k} - a b_{2k} < \frac{b}{b_{2k-1}}. \]

This gives us our contradiction because, since the denominators $b_{2k-1}$ 
in this expression diverge to infinity by Theorem 14.5, we can choose a 
value for k such that $\frac{b}{b_{2k-1}} < 1$. Thus, for this value of k, 

\[ 0 < b a_{2k} - a b_{2k} < 1, \]

and the integer $b a_{2k} - a b_{2k}$ is between 0 and 1, which is impossible. This 
contradiction shows that x, the limit of an infinite simple continued 
fraction, is irrational. 

We have therefore proved the following theorem. 

Theorem 14.6. Let $[q_1, q_2, q_3, \ldots]$ be an infinite simple continued fraction 
with convergents $c_1, c_2, c_3, \ldots$. Then, the sequence $c_1, c_2, c_3, \ldots$ converges 
to an irrational number x, and we express this by saying that 
$x = [q_1, q_2, q_3, \ldots]$. 

The converse to this theorem is almost obvious, since the procedure 
we used to find a continued fraction representation of $\sqrt{5}$ can, in 
principle, be carried out for any irrational number. There is, however, a 
huge difference between being able to carry out something in principle 
and actually being able to do it. Take, for example, finding a continued 
fraction for $\pi$. We saw how this could be done in Problem 3.32, 
but this assumed that we already had an accurate decimal representation 
for $\pi$ (in this problem we assumed we already knew that 
$\pi = 3.141\,592\,653\,589\ldots$). Without this additional information about $\pi$, 
finding a continued fraction representation would quickly bog down 
in practice: since we know that $\lfloor \pi \rfloor = 3$ we can find the first partial 
quotient $q_1 = 3$, but then even to find $q_2 = \lfloor \frac{1}{\pi - 3} \rfloor$ is challenging, to 
say the least. 

It is certainly worth pointing out that the representation for 
an irrational number as an infinite simple continued fraction is 
unique. In other words, if two continued fractions $[q_1, q_2, q_3, \ldots]$ and 
$[p_1, p_2, p_3, \ldots]$ represent the same irrational number x, then $q_i = p_i$ for 
all i. This follows inductively since it is clear that $q_1 = \lfloor x \rfloor = p_1$. Then 
we have $[q_2, q_3, q_4, \ldots] = [p_2, p_3, p_4, \ldots]$, and, once again, $q_2 = p_2$, 
and so on. 

Note that it is Theorem 14.6 that allows us to evaluate specific infinite
continued fractions such as

$$1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}}$$

Theorem 14.6 tells us that this continued fraction converges to some
number x, that is, $x = [1, 1, 1, \ldots]$. Then, we observe as we did in
Problem 3.36 that $x = 1 + \frac{1}{[1, 1, 1, \ldots]}$, giving us the equation $x = 1 + \frac{1}{x}$.
Solving this equation yields $x = \frac{1 + \sqrt{5}}{2}$, showing that this particular
continued fraction does indeed represent the golden ratio.

Earlier, we saw how $\sqrt{5}$ could be expanded into the repeating infinite
continued fraction $[2, 4, 4, 4, \ldots]$. Let’s verify this using Theorem 14.6.
By this theorem we know this continued fraction converges
to some number x, that is, $x = [2, 4, 4, 4, \ldots]$. Thus,

$$x - 2 = \frac{1}{[4, 4, 4, \ldots]} = \cfrac{1}{4 + \cfrac{1}{[4, 4, 4, \ldots]}} = \frac{1}{4 + (x - 2)} = \frac{1}{x + 2},$$

which becomes $x^2 - 4 = 1$, and so $x^2 = 5$, and $x = \sqrt{5}$ as expected.
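
As a numerical companion to this verification, here is a minimal Python sketch. Python is not used in the text, and the name `convergents` is our own choice. It assumes the usual recurrences $a_k = q_k a_{k-1} + a_{k-2}$ and $b_k = q_k b_{k-1} + b_{k-2}$ with starting values $a_{-1} = 0$, $a_0 = 1$, $b_{-1} = 1$, $b_0 = 0$, and lists the first few convergents of $[2, 4, 4, 4, \ldots]$.

```python
from fractions import Fraction
from itertools import chain, cycle, islice


def convergents(terms):
    """Yield the convergents a_k/b_k of [q_1, q_2, ...] as exact fractions."""
    a_prev, a = 0, 1  # a_{-1}, a_0
    b_prev, b = 1, 0  # b_{-1}, b_0
    for q in terms:
        a_prev, a = a, q * a + a_prev
        b_prev, b = b, q * b + b_prev
        yield Fraction(a, b)


# Partial quotients 2, 4, 4, 4, ... as an endless iterator.
terms = chain([2], cycle([4]))
first_five = list(islice(convergents(terms), 5))
for c in first_five:
    print(c, float(c) ** 2)
```

The fifth convergent is 682/305, and since $682^2 - 5 \cdot 305^2 = -1$, its square misses 5 by only $1/305^2$.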

In this same way, it is routine to find the number to which any
repeating infinite continued fraction converges. For repeating continued
fractions we introduce a useful new notation. Since writing the
continued fraction for $\sqrt{3}$ as $[1, 1, 2, \ldots]$ or even as $[1, 1, 2, 1, 2, \ldots]$
does not make clear exactly how the continued fraction repeats itself,
we use the notation $[1, \overline{1, 2}]$ to indicate which group of numbers repeat
themselves over and over again.

You may have noticed that all of the repeating continued fractions
we have seen so far represented numbers that involve a square root. For
example,

$$[1, \overline{1, 2}] = \sqrt{3}, \quad [2, \overline{4}] = \sqrt{5}, \quad [2, \overline{1, 1, 1, 4}] = \sqrt{7}, \quad \text{and} \quad [\overline{1}] = \frac{1 + \sqrt{5}}{2}.$$

Such numbers are called quadratic surds since they are irrational
solutions to quadratic equations with integer coefficients.

Let's do an example to see why any repeating simple continued
fraction is of the form $\frac{a + b\sqrt{c}}{d}$, that is, a quadratic surd. Note that in this
expression we may as well assume that c is squarefree. So, consider the
continued fraction $[1, 2, \overline{3, 4, 5}]$. By Theorem 14.6, $[1, 2, \overline{3, 4, 5}] = x$
for some irrational number x. Our goal is to show that x is a quadratic
surd.

To find a value for x, we first find a value for y where $y = [\overline{3, 4, 5}]$;
that is, y is the value of the purely periodic continued fraction $[\overline{3, 4, 5}]$.
Since

$$[\overline{3, 4, 5}] = 3 + \cfrac{1}{4 + \cfrac{1}{5 + \cfrac{1}{[\overline{3, 4, 5}]}}},$$

we have the equation

$$y = 3 + \cfrac{1}{4 + \cfrac{1}{5 + \cfrac{1}{y}}}.$$

Solving this equation for y, we get

$$y = 3 + \cfrac{1}{4 + \cfrac{y}{5y + 1}} = 3 + \frac{5y + 1}{21y + 4},$$

so $y - 3 = \frac{5y + 1}{21y + 4}$, which becomes $21y^2 - 64y - 13 = 0$, and the positive
solution to this quadratic equation is $y = \frac{32 + \sqrt{1297}}{21}$.
Now, since

$$[1, 2, \overline{3, 4, 5}] = 1 + \cfrac{1}{2 + \cfrac{1}{[\overline{3, 4, 5}]}},$$

we can write

$$x = 1 + \cfrac{1}{2 + \cfrac{21}{32 + \sqrt{1297}}} = 1 + \frac{32 + \sqrt{1297}}{85 + 2\sqrt{1297}} = \frac{117 + 3\sqrt{1297}}{85 + 2\sqrt{1297}}$$

$$= \frac{117 + 3\sqrt{1297}}{85 + 2\sqrt{1297}} \cdot \frac{85 - 2\sqrt{1297}}{85 - 2\sqrt{1297}} = \frac{2163 + 21\sqrt{1297}}{2037} = \frac{103 + \sqrt{1297}}{97},$$

and, as expected, x is a quadratic surd.

Since this same process can be applied to any repeating simple continued
fraction and will always lead to a quadratic equation, we can
state the following theorem.


Theorem 14.7. Any repeating simple continued fraction (that is, a continued
fraction of the form $[q_1, q_2, \ldots, q_m, \overline{p_1, p_2, \ldots, p_n}]$) is a quadratic
surd; that is, it is an irrational solution to a quadratic equation having integer
coefficients.


Proof 

By Theorem 14.6, we can let $y = [\overline{p_1, p_2, \ldots, p_n}]$. Then, since this
continued fraction repeats the terms $p_1, p_2, \ldots, p_n$ over and over again, we
have $y = [p_1, p_2, \ldots, p_n, y]$. Note that this is a finite continued fraction,
although not a simple continued fraction because y is irrational.
However, the argument presented in the proof of Theorem 14.2 can
still be used on this continued fraction. Once again we write the first
n convergents for $[p_1, p_2, \ldots, p_n, y]$ as $\frac{a_1}{b_1}, \frac{a_2}{b_2}, \ldots, \frac{a_n}{b_n}$.
Then, by Theorem 14.2, the nth convergent is given by

$$\frac{p_n \cdot a_{n-1} + a_{n-2}}{p_n \cdot b_{n-1} + b_{n-2}},$$

and, since y can be found by replacing $p_n$ in the continued fraction
$[p_1, p_2, \ldots, p_n]$ by $p_n + \frac{1}{y}$, we get

$$y = \frac{(p_n + \frac{1}{y}) \cdot a_{n-1} + a_{n-2}}{(p_n + \frac{1}{y}) \cdot b_{n-1} + b_{n-2}} = \frac{(y p_n + 1) \cdot a_{n-1} + y a_{n-2}}{(y p_n + 1) \cdot b_{n-1} + y b_{n-2}}$$

$$= \frac{y(p_n \cdot a_{n-1} + a_{n-2}) + a_{n-1}}{y(p_n \cdot b_{n-1} + b_{n-2}) + b_{n-1}} = \frac{y \cdot a_n + a_{n-1}}{y \cdot b_n + b_{n-1}},$$

which we rewrite as

$$b_n y^2 + (b_{n-1} - a_n) y - a_{n-1} = 0.$$

Thus y is a quadratic surd.

Again, by Theorem 14.6, let $x = [q_1, q_2, \ldots, q_m, \overline{p_1, p_2, \ldots, p_n}]$;
that is, let $x = [q_1, q_2, \ldots, q_m, y]$. So we write the first m convergents
for $[q_1, q_2, \ldots, q_m, y]$ as $\frac{c_1}{d_1}, \frac{c_2}{d_2}, \ldots, \frac{c_m}{d_m}$, and we conclude, as before, that

$$x = \frac{y \cdot c_m + c_{m-1}}{y \cdot d_m + d_{m-1}},$$

and x is a quadratic surd since y is a quadratic surd.

This completes the proof of the theorem. ■


It is perhaps worth reviewing the results in the proof of this theorem
in terms of the example we used to motivate the theorem. So, for the
continued fraction $[1, 2, \overline{3, 4, 5}]$ considered above, we compute the
first three convergents for $y = [\overline{3, 4, 5}]$ in the table

    k      -1    0    1    2    3
    p_k               3    4    5
    a_k     0    1    3   13   68
    b_k     1    0    1    4   21

and see that $b_3 = 21$, $b_2 - a_3 = -64$, and $a_2 = 13$, which as we saw in the
proof means that y is a quadratic surd that is a solution to the equation
$21y^2 - 64y - 13 = 0$, exactly as we concluded before.

Then we compute the first two convergents for $x = [1, 2, y]$ in the
table

    k      -1    0    1    2
    q_k               1    2
    c_k     0    1    1    3
    d_k     1    0    1    2

and we see as in the proof that

$$x = \frac{y \cdot 3 + 1}{y \cdot 2 + 1},$$

and since $y = \frac{32 + \sqrt{1297}}{21}$, we get

$$x = \frac{3\left(\frac{32 + \sqrt{1297}}{21}\right) + 1}{2\left(\frac{32 + \sqrt{1297}}{21}\right) + 1} = \frac{117 + 3\sqrt{1297}}{85 + 2\sqrt{1297}} = \cdots = \frac{103 + \sqrt{1297}}{97},$$

exactly as we concluded before.
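
The table computation above is mechanical enough to sketch in a few lines of Python. This is only an illustration under our own naming (`period_quadratic` is a hypothetical helper, not something from the text): it runs the convergent recurrence over one period and reads off the coefficients $b_n$, $b_{n-1} - a_n$, $-a_{n-1}$ of the quadratic from the proof of Theorem 14.7.

```python
import math


def period_quadratic(period):
    """Return (A, B, C) with A*y^2 + B*y + C = 0 for y = [period, period, ...]."""
    a_prev, a = 0, 1  # a_{-1}, a_0
    b_prev, b = 1, 0  # b_{-1}, b_0
    for p in period:
        a_prev, a = a, p * a + a_prev
        b_prev, b = b, p * b + b_prev
    # From y = (y*a_n + a_{n-1}) / (y*b_n + b_{n-1}).
    return b, b_prev - a, -a_prev


A, B, C = period_quadratic([3, 4, 5])
y = (-B + math.sqrt(B * B - 4 * A * C)) / (2 * A)  # positive root
x = (3 * y + 1) / (2 * y + 1)  # uses the convergents 1/1 and 3/2 of [1, 2]
print((A, B, C), y, x)
```

The floating-point values of y and x are only a sanity check; the exact values are the surds found in the text.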

The converse of Theorem 14.7 is also true. This was proven by
Lagrange. We state this converse as a theorem, but omit any proof.

Theorem 14.8 (Lagrange). Any quadratic surd x can be represented by a
repeating simple continued fraction, that is,

$$x = [q_1, q_2, \ldots, q_m, \overline{p_1, p_2, \ldots, p_n}].$$


Approximation 

We mentioned earlier that the main reason we are interested in in- 
finite continued fractions is that they can be used to provide good 
approximations for irrational numbers. We begin our brief discussion 
of approximation with a theorem that we all but proved while we were 
proving Theorem 14.6. This theorem provides an excellent estimate of 
how closely a given convergent approximates the value of a continued 
fraction. 

Theorem 14.9. Let x be an irrational number represented by the infinite
simple continued fraction $[q_1, q_2, q_3, \ldots]$, where as usual we write $c_k = \frac{a_k}{b_k}$
for the kth convergent. Then, for each k, we have

$$\left| x - \frac{a_k}{b_k} \right| < \frac{1}{b_k b_{k+1}}.$$


Proof 

We will use essentially the same argument used to prove Theorem 14.6.
First, we assume that k is odd; hence we have

$$c_k < x < c_{k+1},$$

and so, by Theorem 14.3,

$$\left| x - \frac{a_k}{b_k} \right| = x - c_k < c_{k+1} - c_k = \frac{a_{k+1}}{b_{k+1}} - \frac{a_k}{b_k} = \frac{a_{k+1} b_k - b_{k+1} a_k}{b_k b_{k+1}} = \frac{(-1)^{k+1}}{b_k b_{k+1}} = \frac{1}{b_k b_{k+1}}.$$

Next, assume that k is even. The argument is almost identical. We have

$$c_{k+1} < x < c_k,$$

and so

$$\left| x - \frac{a_k}{b_k} \right| = c_k - x < c_k - c_{k+1} = \frac{a_k}{b_k} - \frac{a_{k+1}}{b_{k+1}} = -\left( \frac{a_{k+1} b_k - b_{k+1} a_k}{b_k b_{k+1}} \right) = \frac{-(-1)^{k+1}}{b_k b_{k+1}} = \frac{1}{b_k b_{k+1}}.$$

This completes the proof of the theorem.


Note that all this proof really amounts to is observing that the
distance between two convergents $c_k$ and $c_{k+1}$ is $\frac{1}{b_k b_{k+1}}$ and, since x is
between $c_k$ and $c_{k+1}$, the distance between $c_k$ and x is less than $\frac{1}{b_k b_{k+1}}$.

In Problem 4.14 we used the continued fraction $[2, \overline{4}]$ to approximate
$\sqrt{5}$ to four decimal places. There, we found that the fourth
convergent, $\frac{161}{72}$, correctly approximates $\sqrt{5}$ to four places. However, we
could have known this at that time only because our calculator gave
us the same value 2.2361 for $\frac{161}{72}$ and for $\sqrt{5}$. In other words, we had
to know the value of $\sqrt{5}$ ahead of time. But, by Theorem 14.9, we can
be assured that $\frac{161}{72}$ correctly approximates $\sqrt{5}$ to four decimal places
without having any prior knowledge of the value of $\sqrt{5}$. All we need
to do is calculate the fifth convergent and then use the theorem to
conclude that

$$\left| \frac{161}{72} - \sqrt{5} \right| < \frac{1}{72 \cdot 305} \approx .000046,$$

which means that $\frac{161}{72}$ is a rational approximation to $\sqrt{5}$ that is accurate
to four decimal places.
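
For readers who like to check such estimates by machine, the following Python sketch (our own illustration; the helper name `convergent_pairs` is hypothetical) computes the fourth and fifth convergents of $[2, 4, 4, 4, \ldots]$ and compares the actual error of the fourth convergent with the bound from Theorem 14.9. The call to `math.sqrt` is used only to confirm the bound; the bound itself needs nothing but the two denominators.

```python
import math


def convergent_pairs(terms):
    """Return [(a_1, b_1), (a_2, b_2), ...] for a finite list of partial quotients."""
    pairs = []
    a_prev, a = 0, 1  # a_{-1}, a_0
    b_prev, b = 1, 0  # b_{-1}, b_0
    for q in terms:
        a_prev, a = a, q * a + a_prev
        b_prev, b = b, q * b + b_prev
        pairs.append((a, b))
    return pairs


pairs = convergent_pairs([2, 4, 4, 4, 4])
(a4, b4), (a5, b5) = pairs[3], pairs[4]
bound = 1 / (b4 * b5)                  # Theorem 14.9: 1 / (b_k * b_{k+1})
error = abs(a4 / b4 - math.sqrt(5))    # actual distance, for comparison only
print(f"{a4}/{b4}: error {error:.7f}, bound {bound:.7f}")
```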

We now turn our attention to the main idea in this section: not
only can infinite continued fractions be used to approximate irrational
numbers, but they yield the best possible rational approximations. It is
no accident that throughout history rational numbers such as $\frac{22}{7}$ and
$\frac{355}{113}$ have been used repeatedly as excellent approximations for $\pi$.

Theorem 14.9 tells us in a very precise way that a given convergent
$\frac{a_n}{b_n}$ is a good approximation for an irrational number x. Our goal in the
remainder of this section will be to show that the only way to improve
this approximation would be to use a fraction $\frac{a}{b}$ where the denominator
b is larger than $b_n$. This is the sense in which we consider a convergent
$\frac{a_n}{b_n}$ to be a best approximation because the only way to get a better
approximation is to use fractions with larger denominators, and it turns
out that the next fraction that is better is $\frac{a_{n+1}}{b_{n+1}}$.

However, before we can record this important result about conver- 
gents, we need to prove a fact about convergents that may have seemed 
obvious all along. For an infinite simple continued fraction converging 
to x we know that the even convergents get closer and closer to x and 
also that the odd convergents get closer and closer to x. What we don’t 
yet know — although we undoubtedly already believe it— is that each 
convergent gets closer to x than the previous convergent. This is what 
we now prove. 


Theorem 14.10. For an infinite simple continued fraction converging to
an irrational number x, each convergent is closer to x than the previous
convergent.


Proof

Let $x = [q_1, q_2, q_3, \ldots]$ be an infinite simple continued fraction with
convergents $\frac{a_1}{b_1}, \frac{a_2}{b_2}, \frac{a_3}{b_3}, \ldots$.

Now, as we did in the proof of Theorem 14.7, express x as

$$x = [q_1, q_2, q_3, \ldots, q_n, y]$$

where $y = [q_{n+1}, q_{n+2}, q_{n+3}, \ldots]$. Then, as we saw before, we have

$$x = \frac{y a_n + a_{n-1}}{y b_n + b_{n-1}}.$$

This equation leads to

$$y(x b_n - a_n) = a_{n-1} - x b_{n-1} = -b_{n-1}\left( x - \frac{a_{n-1}}{b_{n-1}} \right).$$

Thus, dividing by $y b_n$, we get

$$x - \frac{a_n}{b_n} = -\frac{b_{n-1}}{y b_n}\left( x - \frac{a_{n-1}}{b_{n-1}} \right).$$

But since $b_{n-1} < b_n$ and y > 1, we conclude that

$$\left| x - \frac{a_n}{b_n} \right| < \left| x - \frac{a_{n-1}}{b_{n-1}} \right|.$$

This completes the proof of the theorem. ■

Now we are ready to prove the main theorem in this section, which 
says that it is the convergents in infinite simple continued fractions that 
provide us with the best possible rational approximations for irrational 
numbers. 


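Theorem 14.11. Let $x = [q_1, q_2, q_3, \ldots]$ be an infinite simple continued
fraction with convergents $\frac{a_1}{b_1}, \frac{a_2}{b_2}, \ldots$. Let n > 1, and let $\frac{a}{b}$ be a fraction
with b > 0 such that

$$\left| \frac{a}{b} - x \right| < \left| \frac{a_n}{b_n} - x \right|.$$

Then, $b > b_n$.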


Proof 

Without loss of generality we will assume that x > 0. There are two
cases. We begin with the simplest case.

Case 1: We assume that either $x < \frac{a}{b} < \frac{a_n}{b_n}$ or $\frac{a_n}{b_n} < \frac{a}{b} < x$. Thus

$$\left| \frac{a_n}{b_n} - \frac{a}{b} \right| < \left| \frac{a_n}{b_n} - x \right| < \frac{1}{b_n b_{n+1}}$$

by Theorem 14.9. So, multiplying by $b b_n$, we get

$$0 < |b a_n - a b_n| < \frac{b}{b_{n+1}}.$$

But $b a_n - a b_n$ is an integer, so $b > b_{n+1} > b_n$, as desired.

Case 2: We assume that either $\frac{a}{b} < x < \frac{a_n}{b_n}$ or $\frac{a_n}{b_n} < x < \frac{a}{b}$. But by
Theorem 14.10, this means that either

$$\frac{a_{n-1}}{b_{n-1}} < \frac{a}{b} < x < \frac{a_n}{b_n} \quad \text{or} \quad \frac{a_n}{b_n} < x < \frac{a}{b} < \frac{a_{n-1}}{b_{n-1}}.$$

Thus

$$\left| \frac{a_{n-1}}{b_{n-1}} - \frac{a}{b} \right| < \left| \frac{a_{n-1}}{b_{n-1}} - x \right| < \frac{1}{b_{n-1} b_n}$$

by Theorem 14.9. So, multiplying by $b b_{n-1}$, we get

$$0 < |b a_{n-1} - a b_{n-1}| < \frac{b}{b_n}.$$

But $b a_{n-1} - a b_{n-1}$ is an integer, so $b > b_n$, as desired.


This completes the proof of the theorem. ■ 
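
Theorem 14.11 can be spot-checked by brute force. The Python sketch below is our own illustration (the function name `closer_fractions` is hypothetical, and floating-point arithmetic stands in for exact comparison, which is adequate at this small scale): it searches every denominator up to a limit for a fraction strictly closer to $\sqrt{5}$ than the third convergent $\frac{38}{17}$ of $[2, 4, 4, 4, \ldots]$.

```python
import math
from fractions import Fraction


def closer_fractions(x, target, max_b):
    """Reduced fractions with denominator <= max_b strictly closer to x than target."""
    target_error = abs(float(target) - x)
    found = set()
    for b in range(1, max_b + 1):
        a = round(b * x)  # the nearest numerator is the only candidate for this b
        f = Fraction(a, b)
        if abs(float(f) - x) < target_error:
            found.add(f)
    return sorted(found)


x = math.sqrt(5)
none_up_to_17 = closer_fractions(x, Fraction(38, 17), 17)
better_up_to_72 = closer_fractions(x, Fraction(38, 17), 72)
print(none_up_to_17)
print(better_up_to_72)
```

No fraction with denominator at most 17 does better, and every improvement found with denominators up to 72 (including the next convergent, 161/72) has a denominator larger than 17, as the theorem predicts.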


Here is another result on approximation that is quite similar to 
Theorem 14.11. 


Theorem 14.12. Let $x = [q_1, q_2, q_3, \ldots]$ be an infinite simple continued
fraction with convergents $\frac{a_1}{b_1}, \frac{a_2}{b_2}, \frac{a_3}{b_3}, \ldots$. Let $\frac{a_n}{b_n}$ and $\frac{a_{n+1}}{b_{n+1}}$ be two consecutive
convergents; then the inequality

$$\left| \frac{a_k}{b_k} - x \right| < \frac{1}{2 b_k^2}$$

is satisfied by at least one of the two convergents, that is, for k = n or k = n + 1.


Proof 

This proof relies on a simple algebraic inequality, which we prove first. If
$a \ne b$, then $(b - a)^2 > 0$, and so $2ab < b^2 + a^2$. Now, as long as $ab \ne 0$, we
can divide this last inequality by $2a^2b^2$, and get the inequality we need:

$$\frac{1}{ab} < \frac{1}{2a^2} + \frac{1}{2b^2}.$$

We give a proof by contradiction. First, note that although we don’t
know which convergent is larger, we do know that x is between $\frac{a_n}{b_n}$ and
$\frac{a_{n+1}}{b_{n+1}}$; thus the distance between the two convergents is equal to the
distance from x to $\frac{a_n}{b_n}$ plus the distance from x to $\frac{a_{n+1}}{b_{n+1}}$, that is,

$$\left| \frac{a_{n+1}}{b_{n+1}} - \frac{a_n}{b_n} \right| = \left| \frac{a_n}{b_n} - x \right| + \left| \frac{a_{n+1}}{b_{n+1}} - x \right|.$$

Also, as we have seen before,

$$\left| \frac{a_{n+1}}{b_{n+1}} - \frac{a_n}{b_n} \right| = \frac{1}{b_n b_{n+1}}.$$

So, by way of contradiction, assume the conclusion of the theorem is
false. Then we get

$$\frac{1}{b_n b_{n+1}} = \left| \frac{a_n}{b_n} - x \right| + \left| \frac{a_{n+1}}{b_{n+1}} - x \right| \ge \frac{1}{2 b_n^2} + \frac{1}{2 b_{n+1}^2},$$

which is impossible by the basic algebraic inequality we derived above.
This completes the proof of the theorem. ■


We end this section on approximation with a theorem that tells us 
that if we have a rational number that is a good approximation to an 
irrational number, then the rational number is in fact a convergent! 

Theorem 14.13. Suppose that x is an irrational number, and that $\frac{a}{b}$ is a
rational number, in lowest terms and with b > 0, such that

$$\left| \frac{a}{b} - x \right| < \frac{1}{2b^2};$$

then $\frac{a}{b}$ is a convergent of the continued fraction that represents x.


Proof 

Let d be the distance between $\frac{a}{b}$ and x. Thus, $\frac{a}{b} - x = \pm d$ depending on
whether $\frac{a}{b}$ is greater than, or less than, x. Also, note that $0 < d < \frac{1}{2b^2}$.
Next, write $\frac{a}{b}$ as a finite simple continued fraction

$$\frac{a}{b} = [q_1, q_2, \ldots, q_n]$$

with convergents $\frac{a_1}{b_1}, \frac{a_2}{b_2}, \ldots, \frac{a_n}{b_n}$. Recall that, in this case, we have for
the last convergent $\frac{a_n}{b_n} = \frac{a}{b}$. Furthermore, recall from Problem 14.5
that when we produce a continued fraction for a rational number we
can choose to have either an even or odd number of terms. So for the
continued fraction $[q_1, q_2, \ldots, q_n]$ we will choose n to be even if $\frac{a}{b} - x$
is positive and choose n to be odd if $\frac{a}{b} - x$ is negative.

Now, let y be the irrational number $y = \frac{a_{n-1} - x b_{n-1}}{x b_n - a_n}$. It is routine to
verify that

$$x = \frac{y a_n + a_{n-1}}{y b_n + b_{n-1}}$$

by plugging the value for y into the right side of this expression and
then simplifying.

We can therefore now write x as the (nonsimple) finite continued
fraction

$$x = [q_1, q_2, \ldots, q_n, y].$$

Then we write $y = [q_{n+1}, q_{n+2}, q_{n+3}, \ldots]$ as an infinite continued
fraction; thus we can also express x as the infinite continued fraction

$$x = [q_1, \ldots, q_n, q_{n+1}, q_{n+2}, q_{n+3}, \ldots].$$

We have now accomplished exactly what we want, because we can
see that $\frac{a}{b}$ is a convergent of this continued fraction representing x since
$\frac{a}{b} = \frac{a_n}{b_n}$ and $\frac{a_n}{b_n}$ is the nth convergent for this continued fraction.

There is, however, one small detail left to verify. We need the continued
fraction for x above to be a simple continued fraction. In order for
this to be the case, we therefore need y > 1, which will ensure that
all of the partial quotients $q_{n+1}, q_{n+2}, q_{n+3}, \ldots$ are positive.

To attend to this detail, we write

$$\frac{a}{b} - x = \frac{a_n}{b_n} - \frac{y a_n + a_{n-1}}{y b_n + b_{n-1}} = \frac{a_n y b_n + a_n b_{n-1} - b_n y a_n - b_n a_{n-1}}{b_n (y b_n + b_{n-1})}$$

$$= \frac{a_n b_{n-1} - b_n a_{n-1}}{b_n (y b_n + b_{n-1})} = \frac{(-1)^n}{b_n (y b_n + b_{n-1})}.$$

But we chose n to be even or odd according to whether $\frac{a}{b} - x$ is positive
or negative, and we can therefore conclude that

$$d = \frac{1}{b_n (y b_n + b_{n-1})},$$

where d was defined above to be the distance between $\frac{a}{b}$ and x.
Solving this equation for y yields

$$y = \frac{1}{d b_n^2} - \frac{b_{n-1}}{b_n}.$$

But $\frac{1}{d b_n^2} > 2$ (because we saw above that $d < \frac{1}{2b^2}$, and $b = b_n$), and $\frac{b_{n-1}}{b_n} < 1$ (because
the denominators of convergents always increase); therefore, y > 1, as
desired.

This completes the proof of the theorem. ■ 


Pell's Equation 

We introduced Pell’s equation

$$x^2 - ny^2 = 1$$

in Chapter 4 as an example of a Diophantine equation. The solution
x = 577, y = 408 of the Pell equation $x^2 - 2y^2 = 1$ was used in India in
the fourth century to produce the fraction $\frac{577}{408}$ as an excellent rational
approximation for $\sqrt{2}$.

It is easy to see why solutions to Pell’s equation can be used to approximate
$\sqrt{n}$; this was known to Archimedes, who used this method to
find approximations for square roots. If x and y are two positive integers
such that $x^2 - ny^2 = 1$, then it follows that $\frac{x}{y} = \sqrt{n + \frac{1}{y^2}}$, and for large
values of y the right-hand side is very close to $\sqrt{n}$.

But how do you find a solution such as x = 577, y = 408 for a Pell
equation? Well, we saw in Chapter 4 that one way to do it seems to be to
use continued fractions. Since $\sqrt{2}$ can be represented by the continued
fraction $[1, \overline{2}]$, we can compute the convergents for this continued
fraction and get

$$\frac{1}{1}, \ \frac{3}{2}, \ \frac{7}{5}, \ \frac{17}{12}, \ \frac{41}{29}, \ \frac{99}{70}, \ \frac{239}{169}, \ \frac{577}{408}$$

and there is the solution x = 577, y = 408 for the equation $x^2 - 2y^2 = 1$
just sitting right there as a convergent!

In fact, all of the even convergents seem to produce solutions, so
x = 3, y = 2 is a solution, x = 17, y = 12 is a solution, x = 99, y = 70 is
a solution, and so on.
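
The pattern is easy to confirm by machine. The following Python sketch (an illustration of ours; `sqrt2_convergents` is a hypothetical name) lists the first eight convergents of $[1, 2, 2, 2, \ldots]$ and evaluates $a^2 - 2b^2$ for each one.

```python
def sqrt2_convergents(count):
    """First `count` convergents (a_k, b_k) of sqrt(2) = [1, 2, 2, 2, ...]."""
    pairs = []
    a_prev, a = 0, 1  # a_{-1}, a_0
    b_prev, b = 1, 0  # b_{-1}, b_0
    for k in range(count):
        q = 1 if k == 0 else 2  # partial quotients 1, 2, 2, 2, ...
        a_prev, a = a, q * a + a_prev
        b_prev, b = b, q * b + b_prev
        pairs.append((a, b))
    return pairs


pairs = sqrt2_convergents(8)
pell_values = [a * a - 2 * b * b for a, b in pairs]
for (a, b), value in zip(pairs, pell_values):
    print(f"{a}/{b}: a^2 - 2b^2 = {value}")
```

The values alternate between -1 and +1, so it is exactly the even convergents that solve $x^2 - 2y^2 = 1$.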

Before we begin our brief study of Pell’s equation we should note that
it is customary in Pell’s equation $x^2 - ny^2 = 1$ to assume that n is a
positive nonsquare integer. Clearly, if n is not positive, there are only
trivial and uninteresting solutions such as x = 1, y = 0. Similarly, if
$n = m^2$ is a square, then $x^2 - ny^2 = x^2 - m^2y^2 = (x - my)(x + my) = 1$, and
so $x - my = x + my = \pm 1$; hence the only solutions are $x = \pm 1$, y = 0,
which again is uninteresting.

Our first result about Pell’s equation says that the only place we need
to look for solutions to $x^2 - ny^2 = 1$ is among the even convergents of
the continued fraction for $\sqrt{n}$.

Theorem 14.14. If a, b > 0 satisfies Pell's equation $x^2 - ny^2 = 1$, then $\frac{a}{b}$ is
an even convergent of the continued fraction representation for $\sqrt{n}$.


Proof

We rewrite $a^2 - nb^2 = 1$ as $(a - b\sqrt{n})(a + b\sqrt{n}) = 1$. Since a, b > 0 we
have $a + b\sqrt{n} > 0$, which means that $a - b\sqrt{n} > 0$ as well; thus $a > b\sqrt{n}$.

We now write the equation $(a - b\sqrt{n})(a + b\sqrt{n}) = 1$ in the form
$\frac{a}{b} - \sqrt{n} = \frac{1}{b(a + b\sqrt{n})}$, and we conclude that

$$0 < \frac{a}{b} - \sqrt{n} = \frac{1}{b(a + b\sqrt{n})} < \frac{1}{b(b\sqrt{n} + b\sqrt{n})} = \frac{1}{2b^2\sqrt{n}} < \frac{1}{2b^2}.$$

Therefore, by Theorem 14.13, $\frac{a}{b}$ is a convergent of the continued fraction
representation for $\sqrt{n}$, and in fact is an even convergent since $\frac{a}{b} > \sqrt{n}$.
This completes the proof of the theorem. ■


Since solutions to Pell’s equation $x^2 - ny^2 = 1$ are always convergents
of the continued fraction for $\sqrt{n}$, we clearly need to learn more about
such continued fractions. By Lagrange’s theorem, Theorem 14.8, we
know that the continued fraction for $\sqrt{n}$ will be a repeating simple
continued fraction. But we can in fact say a good bit more about these
continued fractions.

Here are the continued fractions for the square roots of the first
twenty nonsquare positive integers:

    $\sqrt{2} = [1, \overline{2}]$                 $\sqrt{3} = [1, \overline{1, 2}]$                 $\sqrt{5} = [2, \overline{4}]$
    $\sqrt{6} = [2, \overline{2, 4}]$              $\sqrt{7} = [2, \overline{1, 1, 1, 4}]$           $\sqrt{8} = [2, \overline{1, 4}]$
    $\sqrt{10} = [3, \overline{6}]$                $\sqrt{11} = [3, \overline{3, 6}]$                $\sqrt{12} = [3, \overline{2, 6}]$
    $\sqrt{13} = [3, \overline{1, 1, 1, 1, 6}]$    $\sqrt{14} = [3, \overline{1, 2, 1, 6}]$          $\sqrt{15} = [3, \overline{1, 6}]$
    $\sqrt{17} = [4, \overline{8}]$                $\sqrt{18} = [4, \overline{4, 8}]$                $\sqrt{19} = [4, \overline{2, 1, 3, 1, 2, 8}]$
    $\sqrt{20} = [4, \overline{2, 8}]$             $\sqrt{21} = [4, \overline{1, 1, 2, 1, 1, 8}]$    $\sqrt{22} = [4, \overline{1, 2, 4, 2, 1, 8}]$
    $\sqrt{23} = [4, \overline{1, 3, 1, 8}]$       $\sqrt{24} = [4, \overline{1, 8}]$

There are several interesting things to notice about these continued
fractions. The most obvious thing is that the repetition always begins
with the second term. Also, the first term is just $\lfloor \sqrt{n} \rfloor$, which of course
is true for any continued fraction. Another interesting thing is that the
last term is always twice the first, so, for example, $\sqrt{23} = [4, \overline{1, 3, 1, 8}]$
begins with 4 and the last number in the group of four numbers that
repeat is 8. Finally, notice that, except for this last number, the numbers
that repeat form a symmetric sequence, so, for example, in
$\sqrt{19} = [4, \overline{2, 1, 3, 1, 2, 8}]$ the five numbers 2, 1, 3, 1, 2 are symmetric and in
$\sqrt{23} = [4, \overline{1, 3, 1, 8}]$ the three numbers 1, 3, 1 are also symmetric.
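
A list like this can be regenerated with exact integer arithmetic. The Python sketch below is our own illustration, not the method of the text: it uses a standard recurrence that writes each complete quotient as $\frac{m + \sqrt{n}}{d}$ with integers m and d, and it assumes the pattern just observed (the period ends at the first partial quotient equal to $2\lfloor \sqrt{n} \rfloor$) as its stopping rule. The name `sqrt_cf` is hypothetical.

```python
from math import isqrt


def sqrt_cf(n):
    """Return (a0, period) with sqrt(n) = [a0, period repeated], n a nonsquare."""
    a0 = isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    m, d, a = 0, 1, a0  # current complete quotient is (m + sqrt(n)) / d
    period = []
    while a != 2 * a0:  # assumed stopping rule: the period ends with 2 * a0
        m = d * a - m
        d = (n - m * m) // d  # the division is always exact
        a = (a0 + m) // d     # floor of (m + sqrt(n)) / d
        period.append(a)
    return a0, period


for n in (19, 22, 23):
    print(n, sqrt_cf(n))
```

Running it for n = 2 through 24 (skipping the squares) reproduces the twenty expansions above, including the symmetry of each period once its last term is removed.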

In order to explain some of this, let’s begin by looking at purely
periodic continued fractions. We start with an example: the purely
periodic continued fraction $y = [\overline{3, 4, 5}]$ we used earlier as motivation
for Theorem 14.7. We discovered, first of all, that y is a quadratic surd
and is the positive solution to the quadratic equation

$$21y^2 - 64y - 13 = 0;$$

in particular, then,

$$y = \frac{32 + \sqrt{1297}}{21}.$$

Now, let’s see what happens when we consider the continued fraction
where we reverse the period; that is, the continued fraction becomes
$z = [\overline{5, 4, 3}]$. We analyze this exactly as we did before. Since

$$[\overline{5, 4, 3}] = 5 + \cfrac{1}{4 + \cfrac{1}{3 + \cfrac{1}{[\overline{5, 4, 3}]}}},$$

we have the equation

$$z = 5 + \cfrac{1}{4 + \cfrac{1}{3 + \cfrac{1}{z}}}.$$

Solving this equation for z, as we did before, we get

$$z = 5 + \cfrac{1}{4 + \cfrac{z}{3z + 1}} = 5 + \frac{3z + 1}{13z + 4},$$

so $z - 5 = \frac{3z + 1}{13z + 4}$, which becomes

$$13z^2 - 64z - 21 = 0,$$

and the positive solution to this quadratic equation is $z = \frac{32 + \sqrt{1297}}{13}$.

This is not too surprising given the symmetry of the two quadratic
equations, but here is where things get interesting. If we compute the
negative reciprocal of z, we get

$$-\frac{1}{z} = \frac{-13}{32 + \sqrt{1297}} = \frac{-13}{32 + \sqrt{1297}} \cdot \frac{32 - \sqrt{1297}}{32 - \sqrt{1297}} = \frac{-13(32 - \sqrt{1297})}{-273} = \frac{32 - \sqrt{1297}}{21}.$$

In other words, it turns out that $-\frac{1}{z}$ is the conjugate of y.

What do we conclude from all this? It is clear that y, represented
by the purely periodic continued fraction $[\overline{3, 4, 5}]$, is greater than 1.
Similarly, z is also greater than 1. Therefore, the conjugate of y, which
turned out to be $-\frac{1}{z}$, is between -1 and 0. This leads us to our next
theorem.

Theorem 14.15. Let $y = [\overline{q_1, q_2, \ldots, q_n}]$ be a purely periodic simple
continued fraction; and let $z = [\overline{q_n, q_{n-1}, \ldots, q_1}]$ be the purely periodic
simple continued fraction with the order of the partial quotients in the period
reversed. Then,

(1) y > 1;

(2) $-\frac{1}{z}$ is the conjugate of y; and

(3) $-1 < -\frac{1}{z} < 0$.

Proof 

Since $[\overline{q_1, q_2, \ldots, q_n}]$ is a purely periodic simple continued fraction,
$q_1 = q_{n+1} \ge 1$; therefore, y > 1, which verifies statement (1). For exactly
the same reason, z > 1, which verifies statement (3).

Next, we note that by Theorem 14.7, y is a solution to a quadratic
equation having integer coefficients, and so statement (2) amounts to
saying that $-\frac{1}{z}$ is the other solution.

First we consider the case n = 1. This means that $y = [\overline{q_1}] = z$. This
also means that y satisfies the equation $y = q_1 + \frac{1}{[\overline{q_1}]} = q_1 + \frac{1}{y}$, which
can be rewritten as

$$y^2 - q_1 y - 1 = 0.$$

Dividing this equation by $y^2$ yields $\frac{1}{y^2} + \frac{q_1}{y} - 1 = 0$, and so

$$\left( -\frac{1}{y} \right)^2 - q_1 \left( -\frac{1}{y} \right) - 1 = 0,$$

and $-\frac{1}{y}$ is the conjugate of y. Since z = y, this verifies statement (2) in
this case.

Now we assume n > 1 and, as usual, we let $\frac{a_1}{b_1}, \frac{a_2}{b_2}, \ldots, \frac{a_n}{b_n}$ be the
convergents for the finite continued fraction $[q_1, q_2, \ldots, q_n]$, and to
avoid confusion we let $\frac{\alpha_1}{\beta_1}, \frac{\alpha_2}{\beta_2}, \ldots, \frac{\alpha_n}{\beta_n}$ be the convergents for the finite
continued fraction $[q_n, q_{n-1}, \ldots, q_1]$.

We claim that $\frac{a_n}{a_{n-1}} = [q_n, q_{n-1}, \ldots, q_1]$. First, note that

$$\frac{a_2}{a_1} = \frac{q_1 q_2 + 1}{q_1} = q_2 + \frac{1}{q_1} = [q_2, q_1].$$

Now, for $n \ge 3$, $a_n = q_n a_{n-1} + a_{n-2}$, and $a_{n-1} = q_{n-1} a_{n-2} + a_{n-3}$, and so
on, so we have

$$\frac{a_n}{a_{n-1}} = q_n + \cfrac{1}{\frac{a_{n-1}}{a_{n-2}}} = q_n + \cfrac{1}{q_{n-1} + \cfrac{1}{\frac{a_{n-2}}{a_{n-3}}}} = \cdots = q_n + \cfrac{1}{q_{n-1} + \cdots + \cfrac{1}{\frac{a_2}{a_1}}}$$

$$= q_n + \cfrac{1}{q_{n-1} + \cdots + \cfrac{1}{\frac{q_1 q_2 + 1}{q_1}}} = q_n + \cfrac{1}{q_{n-1} + \cdots + \cfrac{1}{q_2 + \cfrac{1}{q_1}}} = [q_n, q_{n-1}, \ldots, q_1],$$

as claimed. Similarly, we can show that $\frac{b_n}{b_{n-1}} = [q_n, \ldots, q_2]$.

We conclude that

$$\frac{a_n}{a_{n-1}} = \frac{\alpha_n}{\beta_n} \quad \text{and} \quad \frac{b_n}{b_{n-1}} = \frac{\alpha_{n-1}}{\beta_{n-1}}.$$

But since convergents are fractions that are always in lowest terms, it
follows that

$$\alpha_n = a_n, \quad \beta_n = a_{n-1}, \quad \alpha_{n-1} = b_n, \quad \beta_{n-1} = b_{n-1}.$$

Now, as we have done several times before, we can write

$$y = [q_1, \ldots, q_n, y],$$

and so

$$y = \frac{y a_n + a_{n-1}}{y b_n + b_{n-1}};$$

hence

$$b_n y^2 + (b_{n-1} - a_n) y - a_{n-1} = 0$$

is the quadratic equation solved by y.

We can repeat this same process for

$$z = [q_n, q_{n-1}, \ldots, q_1, z],$$

and we get

$$z = \frac{z \alpha_n + \alpha_{n-1}}{z \beta_n + \beta_{n-1}} = \frac{z a_n + b_n}{z a_{n-1} + b_{n-1}},$$

which we can rewrite as

$$a_{n-1} z^2 + (b_{n-1} - a_n) z - b_n = 0.$$

Dividing this equation by $z^2$ yields

$$a_{n-1} + (b_{n-1} - a_n)\left( \frac{1}{z} \right) - b_n \left( \frac{1}{z} \right)^2 = 0,$$

or, equivalently,

$$b_n \left( -\frac{1}{z} \right)^2 + (b_{n-1} - a_n)\left( -\frac{1}{z} \right) - a_{n-1} = 0,$$

and $-\frac{1}{z}$ solves exactly the same quadratic equation that y solves!
Therefore, $-\frac{1}{z}$ is the conjugate of y, which verifies statement (2).

This completes the proof of the theorem. ■

Our next goal will be to prove the converse of Theorem 14.15. Then,
eventually, we will come back and see what all this has to do with Pell’s
equation $x^2 - ny^2 = 1$. At this point it would be good to look at an
example in order to begin to understand why conditions (1), (2), and
(3) in Theorem 14.15 are important. So let’s consider the quadratic surd

$$y = \frac{1 + \sqrt{7}}{3}.$$

First of all, note that y does satisfy (1) and (3) in Theorem 14.15 since
y > 1, and the conjugate satisfies $-1 < \frac{1 - \sqrt{7}}{3} < 0$.

Now we go through the process of expressing y as a continued
fraction. Feel free to ignore the details of the calculations and just focus
on the overall pattern that emerges. We begin by renaming $y = y_1$
(because it is the first quadratic surd in this process); so

$$y_1 = \frac{1 + \sqrt{7}}{3} = 1 + \frac{-2 + \sqrt{7}}{3} = 1 + \cfrac{1}{\frac{3}{-2 + \sqrt{7}}} = 1 + \cfrac{1}{\frac{2 + \sqrt{7}}{1}}.$$

We now let $y_2 = \frac{2 + \sqrt{7}}{1}$, the second quadratic surd in this process.
Note that $y_2$ also satisfies conditions (1) and (3) in Theorem 14.15;
moreover, the irrational part of $y_2$ still involves $\sqrt{7}$.

Next, we repeat this process with $y_2$, and we get

$$y_2 = \frac{2 + \sqrt{7}}{1} = 4 + \frac{-2 + \sqrt{7}}{1} = 4 + \cfrac{1}{\frac{1}{-2 + \sqrt{7}}} = 4 + \cfrac{1}{\frac{2 + \sqrt{7}}{3}}.$$

So let $y_3 = \frac{2 + \sqrt{7}}{3}$, and note that $y_3$ satisfies conditions (1) and (3) in
Theorem 14.15 and still involves $\sqrt{7}$.

Continuing, we have

$$y_3 = \frac{2 + \sqrt{7}}{3} = 1 + \frac{-1 + \sqrt{7}}{3} = 1 + \cfrac{1}{\frac{3}{-1 + \sqrt{7}}} = 1 + \cfrac{1}{\frac{1 + \sqrt{7}}{2}}.$$

So let $y_4 = \frac{1 + \sqrt{7}}{2}$, and note that $y_4$ satisfies conditions (1) and (3) in
Theorem 14.15 and still involves $\sqrt{7}$.

The next step is the interesting one:

$$y_4 = \frac{1 + \sqrt{7}}{2} = 1 + \frac{-1 + \sqrt{7}}{2} = 1 + \cfrac{1}{\frac{2}{-1 + \sqrt{7}}} = 1 + \cfrac{1}{\frac{1 + \sqrt{7}}{3}}.$$

So let $y_5 = \frac{1 + \sqrt{7}}{3}$. But now we notice that $y_5 = y_1$ and so, if we were to
continue the process, we would just be in a loop where $y_6 = y_2$, $y_7 = y_3$,
$y_8 = y_4$, $y_9 = y_5 = y_1$, and so on forever.

In other words, what we have shown is that

$$y_1 = 1 + \cfrac{1}{y_2} = 1 + \cfrac{1}{4 + \cfrac{1}{y_3}} = 1 + \cfrac{1}{4 + \cfrac{1}{1 + \cfrac{1}{y_4}}} = 1 + \cfrac{1}{4 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{y_5}}}},$$

but since $y_5 = y_1$, we know that this continued fraction will now simply
repeat itself from this point on; that is,

$$y = \frac{1 + \sqrt{7}}{3} = [\overline{1, 4, 1, 1}].$$
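
The loop in this example can be traced with exact integer arithmetic. In the Python sketch below (our own illustration; `surd_cf` is a hypothetical name), a surd $\frac{m + \sqrt{n}}{d}$ is stored as the pair (m, d), and each step replaces it by the reciprocal of its fractional part. We assume that d > 0 and that d divides $n - m^2$ at the start, which holds for the surds in this example, and that the surd is greater than 1 with conjugate between -1 and 0, so that d stays positive at every step.

```python
from math import isqrt


def surd_cf(m, d, n):
    """Expand (m + sqrt(n)) / d until a surd repeats; return (preperiod, period)."""
    assert d > 0 and (n - m * m) % d == 0, "assumed form of the starting surd"
    r = isqrt(n)
    seen = {}        # surd (m, d) -> position where it first appeared
    quotients = []
    while (m, d) not in seen:
        seen[(m, d)] = len(quotients)
        q = (m + r) // d          # floor of (m + sqrt(n)) / d, valid for d > 0
        quotients.append(q)
        m = q * d - m             # reciprocal of the fractional part ...
        d = (n - m * m) // d      # ... is again of the form (m + sqrt(n)) / d
    start = seen[(m, d)]
    return quotients[:start], quotients[start:]


print(surd_cf(1, 3, 7))   # the surd (1 + sqrt(7)) / 3 from the example
print(surd_cf(4, 1, 23))  # the surd 4 + sqrt(23)
```

For $\frac{1 + \sqrt{7}}{3}$ the pairs visited are (1, 3), (2, 1), (2, 3), (1, 2), matching $y_1, y_2, y_3, y_4$ above, and the preperiod comes back empty.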


Therefore, what seems to be the case is that if we have a quadratic
surd y that satisfies conditions (1) and (3) of Theorem 14.15, then
the continued fraction for y is purely periodic. Before we prove this,
however, let’s see why, in this example, as we computed $y_1, y_2, y_3, \ldots$,
we necessarily had to reach a point where some $y_k$ was the same as a
previous $y_j$. The reason is that there are only finitely many quadratic
surds of the form $\frac{m + \sqrt{n}}{d}$ that satisfy (1) and (3), and so repetition is
inevitable. In this case, with $\sqrt{7}$ there are only six quadratic surds that
satisfy (1) and (3):

$$\frac{1 + \sqrt{7}}{2}, \ \frac{1 + \sqrt{7}}{3}, \ \frac{2 + \sqrt{7}}{1}, \ \frac{2 + \sqrt{7}}{2}, \ \frac{2 + \sqrt{7}}{3}, \ \frac{2 + \sqrt{7}}{4}.$$

Let’s prove this key fact in general. Suppose $\frac{m + \sqrt{n}}{d}$ is a quadratic surd
that satisfies conditions (1) and (3) of Theorem 14.15. We may as well
assume that d > 0 (this is because in the quadratic equation $ax^2 +
bx + c = 0$ we can always assume a > 0). Next, since $\frac{m + \sqrt{n}}{d} > 1$ and
$\frac{m - \sqrt{n}}{d} > -1$, we see that

$$\frac{2m}{d} = \frac{m + \sqrt{n}}{d} + \frac{m - \sqrt{n}}{d} > 0,$$

and so, m > 0. But $\frac{m - \sqrt{n}}{d} < 0$, so $m < \sqrt{n}$. Therefore,

$$0 < m < \sqrt{n},$$

and there are only a finite number of possible values for m.

Turning now to d, we see that since $\frac{m + \sqrt{n}}{d} > 1$, it follows that
$d < m + \sqrt{n}$; and since $\frac{m - \sqrt{n}}{d} > -1$, it follows that $d > \sqrt{n} - m$. Therefore,

$$\sqrt{n} - m < d < \sqrt{n} + m,$$

and there are only a finite number of possible values for d for each value
of m.
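
The two inequalities make the count for a given n a finite search. Here is a minimal Python sketch of that search (our own illustration; `candidate_surds` is a hypothetical name). Because n is not a square, $0 < m < \sqrt{n}$ is the same as $1 \le m \le \lfloor \sqrt{n} \rfloor$, and $\sqrt{n} - m < d < \sqrt{n} + m$ is the same as $\lfloor \sqrt{n} \rfloor - m + 1 \le d \le \lfloor \sqrt{n} \rfloor + m$.

```python
from math import isqrt


def candidate_surds(n):
    """Pairs (m, d) with 0 < m < sqrt(n) and sqrt(n) - m < d < sqrt(n) + m."""
    r = isqrt(n)
    if r * r == n:
        raise ValueError("n must not be a perfect square")
    pairs = []
    for m in range(1, r + 1):
        for d in range(r - m + 1, r + m + 1):
            pairs.append((m, d))
    return pairs


print(candidate_surds(7))
```

For n = 7 this returns the six pairs corresponding to the six surds listed above.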

We conclude, then, that for each nonsquare positive integer n there
are only a finite number of quadratic surds $\frac{m + \sqrt{n}}{d}$ that satisfy
conditions (1) and (3) of Theorem 14.15. Of course, in order to make use of
this fact we need to know that each of the successive quadratic surds
$y_1, y_2, y_3, \ldots$ involves the same $\sqrt{n}$ in the irrational part. Let's verify
this.

We do this using induction. First of all, we begin with $y = \frac{m + \sqrt{n}}{d}$ and
write this as $y_1 = \frac{m_1 + \sqrt{n}}{d_1}$. Now, assume that, for some $k \ge 1$, $y_k$ is of the
same form, that is, $y_k = \frac{m_k + \sqrt{n}}{d_k}$. Now $y_k$ satisfies a quadratic equation
$ax^2 + bx + c = 0$, where a > 0. But, by the way in which we constructed
the sequence $y_1, y_2, y_3, \ldots$, we have $y_k = q_k + \frac{1}{y_{k+1}}$, where $q_k = \lfloor y_k \rfloor$.
Thus we can rewrite the quadratic equation as

$$a\left( q_k + \frac{1}{y_{k+1}} \right)^2 + b\left( q_k + \frac{1}{y_{k+1}} \right) + c = 0.$$

Multiplying by $y_{k+1}^2$, this becomes a quadratic in $y_{k+1}$:

$$(a q_k^2 + b q_k + c)\, y_{k+1}^2 + (2a q_k + b)\, y_{k+1} + a = 0.$$

But then, for this quadratic equation, we compute the irrational part,
which is

$$(2a q_k + b)^2 - 4(a q_k^2 + b q_k + c)a = b^2 - 4ac = n,$$

and so $y_{k+1}$ is also a quadratic surd of the form $\frac{m_{k+1} + \sqrt{n}}{d_{k+1}}$.

The other key fact that emerged in our example was that as we
computed the successive quadratic surds $y_1, y_2, y_3, \ldots$, each satisfied
conditions (1) and (3) of Theorem 14.15. Let’s prove this. So, assume that
$y_k$ is a quadratic surd satisfying (1) and (3). We need to confirm that $y_{k+1}$
also satisfies conditions (1) and (3) of Theorem 14.15. Condition (1) is
automatic since by definition $0 < \frac{1}{y_{k+1}} = y_k - \lfloor y_k \rfloor < 1$, and so $y_{k+1} > 1$.

Since condition (3) says that the conjugate of $y_{k+1}$ is between -1 and
0, it will be useful to have a notation for conjugates, so we will write $y'_{k+1}$
and $y'_k$ to represent the conjugates of $y_{k+1}$ and $y_k$, respectively. From the
definition of $y_{k+1}$ we have

$$-\frac{1}{y_{k+1}} = q_k - y_k,$$

and, taking the conjugate of both sides (see Problem 14.28), we get

$$-\frac{1}{y'_{k+1}} = q_k - y'_k.$$

But $q_k \ge 1$ and $y'_k < 0$, so $-\frac{1}{y'_{k+1}} > 1$, which means that $y'_{k+1} < 0$, and
also that $y'_{k+1} > -1$. Therefore, $y_{k+1}$ satisfies condition (3).

With these preliminaries out of the way, we are now ready to prove 
the converse of Theorem 14.15. 


Theorem 14.16. Let y be a quadratic surd such that y > 1 and such that its 
conjugate y’ satisfies -1 < y' < 0. Then y is represented by a purely periodic 
simple continued fraction y = [ t/i, q 2 qk ]. 

Proof 

Writing y = y\ — , we let 


Vi = q\ -\ , where qi = LviJ, 

V2 

thus defining y 2 . We continue in this way always defining v *+ 1 in terms 
of yk- 

As we have seen, each quadratic surd yy defined in this way satisfies 
yk > 1, and its conjugate y’ k satisfies -1 < y’ k < 0. Moreover, each of 
these quadratic surds yk is of the form 

But since there are only a finite number of quadratic surds of this 
form for each nonsquare integer n, the sequence y\ , y 2 . y$, . . . must 
eventually repeat itself. Let us suppose that the first repetition occurs 
at yk, and that y* = y ; -, where / < k. We will show that j = 1. 

Suppose, by way of contradiction, that $j > 1$. Then we have

$$y_{k-1} = q_{k-1} + \frac{1}{y_k} \quad \text{and} \quad y_{j-1} = q_{j-1} + \frac{1}{y_j}.$$

Since $\frac{1}{y_k} = \frac{1}{y_j}$, if we can show that $q_{k-1} = q_{j-1}$, then it will follow
that $y_{k-1} = y_{j-1}$, which will be a contradiction since the first repetition
occurs at $y_k$. Therefore, our goal is to show that $q_{k-1} = q_{j-1}$.

Taking conjugates of $y_{k-1} = q_{k-1} + \frac{1}{y_k}$ yields $y'_{k-1} = q_{k-1} + \frac{1}{y'_k}$, which
we can write as

$$-\frac{1}{y'_k} = q_{k-1} + (-y'_{k-1}).$$

But we know that $-1 < y'_{k-1} < 0$, which means that $0 < -y'_{k-1} < 1$. We
conclude that $q_{k-1} = \left\lfloor -\frac{1}{y'_k} \right\rfloor$.

In exactly the same way, we can also conclude that $q_{j-1} = \left\lfloor -\frac{1}{y'_j} \right\rfloor$.
Hence, since $y'_k = y'_j$, we have

$$q_{k-1} = \left\lfloor -\frac{1}{y'_k} \right\rfloor = \left\lfloor -\frac{1}{y'_j} \right\rfloor = q_{j-1},$$

and $q_{k-1} = q_{j-1}$, as desired. This gives us our contradiction, and so $j = 1$
after all; therefore, $y = [\overline{q_1, q_2, \ldots, q_k}]$. This completes the proof of
the theorem. ■
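
The process in this proof can be traced by machine. The following is a minimal sketch, not part of the original text: it assumes the surd is given as $(m + \sqrt{n})/d$ with $d > 0$ dividing $n - m^2$ and satisfying the hypotheses of Theorem 14.16, and the function name `purely_periodic_cf` is hypothetical. It repeatedly replaces $y_k$ by $y_{k+1}$ until the starting surd reappears.

```python
from math import isqrt

def purely_periodic_cf(m, d, n):
    """Return one period of the continued fraction of y = (m + sqrt(n)) / d.

    Assumption (not checked here): y > 1, its conjugate lies between -1 and 0,
    d > 0, and d divides n - m*m. Under these hypotheses the loop terminates.
    """
    r = isqrt(n)                 # floor of sqrt(n)
    start = (m, d)               # remember y_1 so we can detect the repetition
    quotients = []
    while True:
        q = (m + r) // d         # q_k = floor(y_k), valid because d > 0
        quotients.append(q)
        m = q * d - m            # numerator part of y_{k+1}
        d = (n - m * m) // d     # denominator of y_{k+1}
        if (m, d) == start:      # y_{k+1} = y_1: the period is complete
            return quotients

print(purely_periodic_cf(4, 1, 23))   # the surd 4 + sqrt(23)
```

For $4 + \sqrt{23}$ the successive surds are $(4+\sqrt{23})/7$, $(3+\sqrt{23})/2$, $(3+\sqrt{23})/7$, and then $4 + \sqrt{23}$ again, so the first repetition is indeed at the starting surd.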


We are now, at long last, ready to explain the nature of the continued
fractions for $\sqrt{n}$ where $n$ is any nonsquare positive integer. Take
$n = 23$, for example. By Lagrange’s theorem, Theorem 14.8, we know
that $\sqrt{23}$ is represented by a repeating continued fraction, but since its
conjugate, $-\sqrt{23}$, does not lie between -1 and 0, we know by Theorem
14.15 that the continued fraction is not purely periodic.

Notice what happens when we add 4 to $\sqrt{23}$. Since $4 = \lfloor \sqrt{23} \rfloor$, we
now have a quadratic surd $4 + \sqrt{23}$ that is greater than 1 and whose
conjugate, $4 - \sqrt{23}$, automatically lies between -1 and 0. Therefore,
by Theorem 14.16, the continued fraction for $4 + \sqrt{23}$ will be purely
periodic.

And it is. Recall from our previous list of continued fractions for $\sqrt{2}$
through $\sqrt{24}$ that

$$\sqrt{23} = [4, \overline{1, 3, 1, 8}],$$

and so

$$4 + \sqrt{23} = [\overline{8, 1, 3, 1}],$$

which is purely periodic. Note that the first term in $[4, \overline{1, 3, 1, 8}]$ is, of
course, $\lfloor \sqrt{23} \rfloor = 4$, which explains why the last term in the period for
$[4, \overline{1, 3, 1, 8}]$ is twice the first term.

Let’s verify this in general. Let $n$ be a nonsquare positive integer. Then
write $\sqrt{n} = [q_1, q_2, \ldots]$ where $q_1 = \lfloor \sqrt{n} \rfloor$. But $y = q_1 + \sqrt{n}$ is a quadratic
surd such that $y > 1$ and its conjugate $y' = q_1 - \sqrt{n}$ lies between -1 and
0. Therefore, by Theorem 14.16, the continued fraction for $y$ is purely
periodic, and we can write $y$ as

$$y = q_1 + \sqrt{n} = [\overline{2q_1, q_2, q_3, \ldots, q_k}].$$

Thus

$$\sqrt{n} = [q_1, \overline{q_2, q_3, \ldots, q_k, 2q_1}].$$

We now know the basic structure for a continued fraction representing
$\sqrt{n}$. But we also previously observed that these continued fractions
contain a lovely symmetry in that the terms $q_2, q_3, \ldots, q_k$ are symmetric.
It is now easy to explain why this is so.

Using the terminology of Theorem 14.15, we see that the conjugate
of $y$ is $y' = q_1 - \sqrt{n} = -\frac{1}{z}$, where $z = [\overline{q_k, q_{k-1}, \ldots, q_2, 2q_1}]$.
But $z = -\frac{1}{y'} = \frac{1}{\sqrt{n} - q_1}$. Thus, by Problem 14.6,

$$\sqrt{n} - q_1 = [0, \overline{q_k, q_{k-1}, \ldots, q_2, 2q_1}].$$

So

$$\sqrt{n} + q_1 = [\overline{2q_1, q_k, q_{k-1}, \ldots, q_2}].$$

Therefore, we have just concluded that

$$[\overline{2q_1, q_2, q_3, \ldots, q_k}] = [\overline{2q_1, q_k, q_{k-1}, \ldots, q_2}],$$

and so we see that in fact $q_2 = q_k$, $q_3 = q_{k-1}$, and so on.

We record these observations as a theorem. 

Theorem 14.17. Let $n$ be a nonsquare positive integer. Then $\sqrt{n}$ is represented
by a periodic continued fraction of the form

$$[q_1, \overline{q_2, q_3, \ldots, q_k, 2q_1}],$$

where $q_2 = q_k$, $q_3 = q_{k-1}$, ....
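
Theorem 14.17 is easy to check experimentally. The sketch below is an illustration rather than the book's own procedure: it uses the standard integer recurrence for the complete quotients $(m + \sqrt{n})/d$ of $\sqrt{n}$, and the name `sqrt_cf` is hypothetical. It returns $q_1$ together with the period $q_2, \ldots, q_k, 2q_1$.

```python
from math import isqrt

def sqrt_cf(n):
    """Return (q1, period) with sqrt(n) = [q1, period repeated forever].

    Each complete quotient has the form (m + sqrt(n)) / d; the period ends
    as soon as d returns to 1, because the next quotient is then q1 + sqrt(n).
    """
    q1 = isqrt(n)
    if q1 * q1 == n:
        raise ValueError("n must be a nonsquare positive integer")
    m, d, q = 0, 1, q1
    period = []
    while True:
        m = q * d - m
        d = (n - m * m) // d
        q = (m + q1) // d
        period.append(q)
        if d == 1:
            return q1, period

q1, period = sqrt_cf(23)
print(q1, period)
# The last term of the period is 2*q1 and the rest reads the same backwards.
print(period[-1] == 2 * q1, period[:-1] == period[:-1][::-1])
```

Running the same check over many nonsquare values of $n$ is a quick way to see the symmetry $q_2 = q_k$, $q_3 = q_{k-1}$, and so on.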

Now that we know exactly what the continued fractions for $\sqrt{n}$
look like, we can immediately describe how to find a solution for Pell’s
equation $x^2 - ny^2 = 1$.

Theorem 14.18. Let $n$ be a nonsquare positive integer and consider Pell’s
equation

$$x^2 - ny^2 = 1.$$

Represent $\sqrt{n}$ as a continued fraction

$$\sqrt{n} = [q_1, \overline{q_2, q_3, \ldots, q_k, 2q_1}],$$

and write the convergents as $\frac{a_1}{b_1}, \frac{a_2}{b_2}, \frac{a_3}{b_3}, \ldots$.

If $k$ is even, then $x = a_k$, $y = b_k$ is a solution to $x^2 - ny^2 = 1$; if $k$ is odd,
then $x = a_{2k}$, $y = b_{2k}$ is a solution to $x^2 - ny^2 = 1$.

Moreover, if $m$ is a positive integer, then $x = a_{mk}$, $y = b_{mk}$ is a solution if $k$
is even, and $x = a_{2mk}$, $y = b_{2mk}$ is a solution if $k$ is odd.

Proof 

Recall that $q_1 + \sqrt{n} = [\overline{2q_1, q_2, q_3, \ldots, q_k}]$, so we have

$$\sqrt{n} = [q_1, \overline{q_2, q_3, \ldots, q_k, 2q_1}] = [q_1, q_2, \ldots, q_k, \overline{2q_1, q_2, q_3, \ldots, q_k}]$$
$$= [q_1, q_2, \ldots, q_k, q_1 + \sqrt{n}].$$

It follows, as we have seen before, that

$$\sqrt{n} = \frac{(q_1 + \sqrt{n})a_k + a_{k-1}}{(q_1 + \sqrt{n})b_k + b_{k-1}}.$$

This we rewrite as

$$nb_k + (q_1 b_k + b_{k-1})\sqrt{n} = (q_1 a_k + a_{k-1}) + a_k\sqrt{n}.$$

Then, equating the rational and the irrational parts of this expression,
we get

$$nb_k = q_1 a_k + a_{k-1} \quad \text{and} \quad q_1 b_k + b_{k-1} = a_k.$$

Solving for $a_{k-1}$ and $b_{k-1}$, we end up with

$$a_{k-1} = nb_k - q_1 a_k \quad \text{and} \quad b_{k-1} = a_k - q_1 b_k.$$

Thus, by Theorem 14.3,

$$a_k(a_k - q_1 b_k) - b_k(nb_k - q_1 a_k) = (-1)^k,$$

and so

$$a_k^2 - nb_k^2 = (-1)^k.$$


Therefore, if $k$ is even, $x = a_k$, $y = b_k$ is a solution to Pell’s equation
$x^2 - ny^2 = 1$.

On the other hand, if $k$ is odd, then we can repeat the argument with
$2k$ replacing $k$, and we conclude that

$$a_{2k}^2 - nb_{2k}^2 = (-1)^{2k} = 1.$$

Hence $x = a_{2k}$, $y = b_{2k}$ is a solution to Pell’s equation $x^2 - ny^2 = 1$.

Finally, for a positive integer $m$, we repeat the argument with $mk$
replacing $k$ if $k$ is even, and $2mk$ replacing $k$ if $k$ is odd.

This completes the proof of the theorem. ■ 

Let’s do two examples, one where $k$ is even, and one where $k$ is odd.
First let’s solve the Pell equation

$$x^2 - 23y^2 = 1.$$

Since $\sqrt{23} = [4, \overline{1, 3, 1, 8}]$, we have $k = 4$. Computing the convergents,
we get $\frac{4}{1}, \frac{5}{1}, \frac{19}{4}, \frac{24}{5}, \ldots$; therefore, $x = 24$, $y = 5$ is a solution, which we
check by computing $24^2 - 23 \cdot 5^2 = 576 - 575 = 1$.

We can find another solution by computing the next four convergents,
which yields $\frac{211}{44}, \frac{235}{49}, \frac{916}{191}, \frac{1151}{240}$; therefore, $x = 1151$,
$y = 240$ is also a solution, which we check by computing $1151^2 -
23 \cdot 240^2 = 1\,324\,801 - 1\,324\,800 = 1$. We could continue in this way,
finding solutions as long as we like since for any positive integer $m$ the
convergent $\frac{a_{4m}}{b_{4m}}$ provides a solution.

Now let’s solve the Pell equation

$$x^2 - 13y^2 = 1.$$

Since $\sqrt{13} = [3, \overline{1, 1, 1, 1, 6}]$, we have $k = 5$. Computing the first ten
convergents, we get $\frac{3}{1}, \frac{4}{1}, \frac{7}{2}, \frac{11}{3}, \frac{18}{5}, \frac{119}{33}, \frac{137}{38}, \frac{256}{71}, \frac{393}{109}, \frac{649}{180}, \ldots$; therefore,
$x = 649$, $y = 180$ is a solution, which we check by computing
$649^2 - 13 \cdot 180^2 = 421\,201 - 421\,200 = 1$. Again, we could continue
finding solutions as long as we like since for any positive integer $m$ the
convergent $\frac{a_{10m}}{b_{10m}}$ provides a solution.
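
The two examples can be reproduced with a short program. This is a sketch under stated assumptions: the helper names `sqrt_cf` and `pell_solution` are hypothetical, the continued fraction is computed with the standard integer recurrence, and the convergents use the recursion $a_j = q_j a_{j-1} + a_{j-2}$, $b_j = q_j b_{j-1} + b_{j-2}$ with the convention $a_0 = 1$, $b_0 = 0$.

```python
from math import isqrt

def sqrt_cf(n):
    """Return (q1, period) for the continued fraction of sqrt(n), n nonsquare."""
    q1 = isqrt(n)
    m, d, q = 0, 1, q1
    period = []
    while True:
        m = q * d - m
        d = (n - m * m) // d
        q = (m + q1) // d
        period.append(q)
        if d == 1:               # the period ends with the term 2*q1
            return q1, period

def pell_solution(n):
    """Solution of x^2 - n*y^2 = 1 read off as in Theorem 14.18."""
    q1, period = sqrt_cf(n)
    k = len(period)
    steps = k if k % 2 == 0 else 2 * k        # use a_k or a_{2k}
    terms = [q1] + [period[i % k] for i in range(steps - 1)]
    a_prev, b_prev = 1, 0                     # a_0, b_0
    a, b = terms[0], 1                        # a_1, b_1
    for q in terms[1:]:
        a, a_prev = q * a + a_prev, a
        b, b_prev = q * b + b_prev, b
    return a, b

for n in (23, 13):
    x, y = pell_solution(n)
    print(n, x, y, x * x - n * y * y)
```

Each printed line ends with the value of $x^2 - ny^2$, which should be 1.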

So Theorem 14.18 provides us with an elegant method for finding
infinitely many solutions to a given Pell equation; in fact, although
we haven’t proven it, Theorem 14.18 gives us all solutions. However, in
Problem 4.16 we presented a somewhat simpler method for finding all
the solutions. The key to this method is that once you have found
the smallest solution, for example, by using Theorem 14.18, then the
other solutions follow using a simple formula.

Theorem 14.19. Let $n$ be a nonsquare positive integer, and let $x_1, y_1$ be the
smallest positive solution to Pell’s equation

$$x^2 - ny^2 = 1.$$

Then all positive solutions to this equation are given by $x_k, y_k$ for all $k \ge 1$,
where

$$x_k + y_k\sqrt{n} = (x_1 + y_1\sqrt{n})^k.$$

Proof 

We first prove the easy part of this theorem; namely, that this formula
does indeed produce solutions to Pell’s equation. In Problem 14.28, we
see that the conjugate of a product of quadratic surds is the product
of the conjugates. This means that since for each $k$ we have defined

$$x_k + y_k\sqrt{n} = (x_1 + y_1\sqrt{n})^k,$$

then we also have

$$x_k - y_k\sqrt{n} = (x_1 - y_1\sqrt{n})^k.$$

Therefore, for each $k \ge 1$, we have

$$x_k^2 - ny_k^2 = (x_k - y_k\sqrt{n})(x_k + y_k\sqrt{n}) = (x_1 - y_1\sqrt{n})^k (x_1 + y_1\sqrt{n})^k$$
$$= \left((x_1 - y_1\sqrt{n})(x_1 + y_1\sqrt{n})\right)^k = (x_1^2 - ny_1^2)^k = 1^k = 1,$$

and so each $x_k, y_k$ is a solution to Pell’s equation.

The somewhat harder part of the proof is to show that these are all
of the solutions. So suppose, by way of contradiction, that there is a
positive solution $a, b$ that is not one of the solutions produced by the
theorem. Since $x_1, y_1$ is the smallest positive solution, this means that
there is a positive integer $j$ such that

$$(x_1 + y_1\sqrt{n})^j < a + b\sqrt{n} < (x_1 + y_1\sqrt{n})^{j+1}.$$

Now, since $(x_1 - y_1\sqrt{n})(x_1 + y_1\sqrt{n}) = x_1^2 - ny_1^2 = 1$, we have that
$(x_1 - y_1\sqrt{n})^j = \frac{1}{(x_1 + y_1\sqrt{n})^j} = (x_1 + y_1\sqrt{n})^{-j}$. So, multiplying the above
inequality by $(x_1 - y_1\sqrt{n})^j$, we get

$$1 < (a + b\sqrt{n})(x_1 - y_1\sqrt{n})^j < x_1 + y_1\sqrt{n}.$$

But then, writing $(a + b\sqrt{n})(x_1 - y_1\sqrt{n})^j$ as $c + d\sqrt{n}$, we see that

$$c^2 - nd^2 = (c - d\sqrt{n})(c + d\sqrt{n})$$
$$= (a - b\sqrt{n})(x_1 + y_1\sqrt{n})^j (a + b\sqrt{n})(x_1 - y_1\sqrt{n})^j$$
$$= (a^2 - nb^2)(x_1^2 - ny_1^2)^j = 1.$$

Therefore, $c, d$ is a smaller solution than the smallest solution $x_1, y_1$.

However, to complete the contradiction we still need to show that
$c$ and $d$ are positive. This is quite straightforward. Since $c + d\sqrt{n} > 1$,
$0 < (c + d\sqrt{n})^{-1} < 1$, and since $(c + d\sqrt{n})^{-1} = c - d\sqrt{n}$, we have
$0 < c - d\sqrt{n} < 1$. But then $c = \frac{c + d\sqrt{n}}{2} + \frac{c - d\sqrt{n}}{2} > \frac{1}{2} + 0$, so $c > 0$. We
also have $d\sqrt{n} = \frac{c + d\sqrt{n}}{2} - \frac{c - d\sqrt{n}}{2} > \frac{1}{2} - \frac{1}{2} = 0$, so $d > 0$.

This completes the proof of the theorem. ■ 

To illustrate the power of this theorem, let us revisit the problem of
solving the Pell equation

$$x^2 - 23y^2 = 1.$$

We saw that Theorem 14.18 produced the smallest positive solution
$x = 24$, $y = 5$. Therefore, by Theorem 14.19, all solutions are produced
by $(24 + 5\sqrt{23})^k$ for $k = 1, 2, 3, \ldots$. For example, $(24 + 5\sqrt{23})^2 =
1151 + 240\sqrt{23}$ produces the solution $x = 1151$, $y = 240$, which is the
same solution we found above using continued fractions. In fact, with a
calculator, we can almost instantly produce more solutions:

k = 3: x = 55 224, y = 11 515;

k = 4: x = 2 649 601, y = 552 480;

k = 5: x = 127 125 624, y = 26 507 525;

k = 6: x = 6 099 380 351, y = 1 271 808 720.

In principle, we could go on and on forever. 
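
The calculator work above amounts to repeated multiplication by $x_1 + y_1\sqrt{n}$. As a minimal sketch (the function name `pell_powers` is hypothetical), note that $(x + y\sqrt{n})(x_1 + y_1\sqrt{n}) = (x_1 x + n y_1 y) + (x_1 y + y_1 x)\sqrt{n}$, so each new solution follows from the previous one using integer arithmetic only.

```python
def pell_powers(n, x1, y1, count):
    """First `count` solutions x_k, y_k with x_k + y_k*sqrt(n) = (x1 + y1*sqrt(n))**k."""
    solutions = []
    x, y = x1, y1
    for _ in range(count):
        solutions.append((x, y))
        # multiply (x + y*sqrt(n)) by (x1 + y1*sqrt(n))
        x, y = x1 * x + n * y1 * y, x1 * y + y1 * x
    return solutions

for k, (x, y) in enumerate(pell_powers(23, 24, 5, 6), start=1):
    print(k, x, y, x * x - 23 * y * y)
```

The last column printed is $x^2 - 23y^2$, which stays equal to 1 for every $k$.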

Problems 


Finite Continued Fractions 

14.1 ★ (H,S) Illustrate the “obvious” fact that any finite simple continued
fraction is a rational number by expressing the continued fraction
[1, 2, 3, 4, 5] as a rational number.

14.2 ★ (S) Illustrate Theorem 14.1 by using the Euclidean algorithm to express
the rational number $\frac{10463}{43200}$ as a finite continued fraction. Note that
since $0 < \frac{10463}{43200} < 1$, the first partial quotient will be $q_1 = 0$.

14.3 In the definition of the simple continued fraction $[q_1, q_2, \ldots, q_n]$ we
require that all the partial quotients $q_2, \ldots, q_n$ be positive integers,
but we do not require the integer $q_1$ to be positive. Why not?

14.4 We saw that the continued fraction [2, 2, 2] has an alternative
representation [2, 2, 1, 1]. Find an alternative representation for the
continued fraction [1, 2, 3, 4, 5].

14.5 We mentioned that each rational number has two distinct 
representations as a finite simple continued fraction. In fact, one 
representation has an even number of terms and one representation 
has an odd number of terms. 

Let $[q_1, q_2, \ldots, q_n]$ be a simple continued fraction with at least two
terms.

(a) Show that if $q_n = 1$, then there is an alternative representation
with one fewer term;

(b) Show that if $q_n > 1$, then there is an alternative representation
with one more term.

Now, let $[q_1]$ be a continued fraction with just one term.

(c) Show that there is an alternative representation with two terms.

14.6 (H,S) Suppose that the continued fraction $[q_1, q_2, \ldots, q_n]$ represents a
number $x$. Find a continued fraction representation for the
reciprocal $\frac{1}{x}$.

14.7 (H,S) Use Theorem 14.2 to find the convergents for the continued fraction
[1, 2, 3, 4, 5]. You may want to use a table format.

14.8 ★ Use Theorem 14.2 to find the convergents for the continued fraction
[1, 1, 1, 1, 1, 1, 1]. You may want to use a table format. What do you
notice about these convergents?

14.9 (S) The recursive procedure given in Theorem 14.2 for computing the
convergents for a continued fraction $[q_1, q_2, \ldots, q_n]$ can be restated
very neatly using matrices:

$$\begin{pmatrix} a_k & a_{k-1} \\ b_k & b_{k-1} \end{pmatrix} = \begin{pmatrix} q_1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} q_2 & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} q_k & 1 \\ 1 & 0 \end{pmatrix}.$$

Note that for $k = 1$ we are using the convention that $a_0 = 1$, $b_0 = 0$.

(a) Use induction and Theorem 14.2 to verify this formula.

(b) Use this formula to find the convergents for the continued
fraction [1, 116, 1, 2, 2, 2] that represents the number $\frac{2001}{1984}$.

14.10 ★ (S) In Chapter 12, we proved that the Fibonacci numbers satisfied the
identity

$$F_n^2 = F_{n+1}F_{n-1} + (-1)^{n+1}.$$

Use Theorem 14.3 and Problem 14.8 to prove this identity.


14.11 ★ (H,S) Use Theorem 14.3 to find a solution for the Diophantine equation
$900x + 628y = 400$.


14.12 (H,S) Use the continued fraction [1, 2, 3, 4, 5] to illustrate Theorem
14.4; that is, show that if we let $x = [1, 2, 3, 4, 5]$, then

$$C_1 < C_3 < C_5 = x < C_4 < C_2.$$


14.13 (S) Christiaan Huygens, a friend of Fermat whom we first met on page 1, 

used continued fractions to help design a planetarium. At the time it 
was believed that it took Saturn 29.43 years to make one complete 
revolution around the sun. Huygens, therefore, wished to use two 
gears, one with $a$ teeth, and one with $b$ teeth, such that $\frac{a}{b} \approx 29.43$,
and the integers a and b needed to be a reasonable size for the 
manufacture of gears with that many teeth. The principle here is 
much like that on a bicycle when you use your derailleur to shift 
gears, thereby changing the ratio between the rates at which the 
pedals and the wheels are rotating. 

By expressing 29.43 as a continued fraction and computing its 
convergents, find two gears that would provide a good approximation 
to Saturn’s rotational speed as compared with that of the Earth. 

14.14 (H,S) It takes the Earth 365 days, 5 hours, 48 minutes, and 46 seconds to 

make one complete revolution around the sun. 

(a) Verify that one year is equal to $365 + \frac{10463}{43200}$ days.

Since $\frac{10463}{43200} \approx \frac{1}{4}$, in 45 B.C., Julius Caesar added a leap year day to
the month of February every four years. His calendar became known
as the Julian calendar.

(b) Although the approximation of $\frac{10463}{43200}$ by $\frac{1}{4}$ seems quite close,
the Julian calendar would necessarily slowly drift in a way that
would eventually become noticeable, perhaps, for example,
because spring would seem to be coming much earlier than it
used to. Compute by how many days the Julian calendar is off
after sixteen hundred years.

(c) By 1582, the Julian calendar had gotten significantly out of 
whack. So Pope Gregory XIII, whose main concern was that 
Easter be celebrated at the appropriate time of year, introduced 
what is now called the Gregorian calendar, which skips the 
extra leap year day (February 29) during any year divisible by 
100 but not by 400. Thus the Gregorian calendar adds February 
29 to any year that is divisible by 4, except for any year that is 
divisible by 100 but not by 400. Thus the year 2000 had a 
February 29, but 2100, 2200, and 2300 will not. This is the 
calendar we currently use. 

How many days does the Gregorian calendar add every 400 
years? Show that this is a better approximation to astronomical 
reality than that provided by the Julian calendar. 

(d) Each year the Julian calendar is off by $\frac{1}{4} - \frac{10463}{43200} \approx .0078$ days
per year, whereas the Gregorian calendar is off by only
$\frac{97}{400} - \frac{10463}{43200} \approx .0003$ days per year. So now it takes 3323 years for
our calendar to be off by a single day; under the Julian calendar
this happened every 128 years.

In Problem 14.2, we expressed the rational number $\frac{10463}{43200}$ as
the continued fraction [0, 4, 7, 1, 3, 5, 64]. Find the
convergents for this continued fraction and use them to design
a new calendar that would be an “improvement” to the
Gregorian calendar. (The word “improvement” is in quotes here
because while the whole point of this problem is to find a better
approximation for $\frac{10463}{43200}$ than $\frac{97}{400}$, any such improvement may
come at the expense of having a calendar that is far less
convenient to use than the Gregorian calendar.)


Infinite Continued Fractions 

14.15 ★ (H) In Problem 3.34, we expanded $\sqrt{3}$ into the infinite repeating
continued fraction

$$\sqrt{3} = 1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{2 + \cdots}}}}.$$

Use Theorem 14.6 to verify that the continued fraction $[1, \overline{1, 2}]$ does
indeed represent $\sqrt{3}$.


14.16 (H) We mentioned at the beginning of this chapter that Bombelli used
continued fractions in 1572 to represent square roots. For example, he
represented $\sqrt{13}$ as

$$\sqrt{13} = 3 + \cfrac{4}{6 + \cfrac{4}{6 + \cfrac{4}{6 + \cdots}}}.$$

This suggests he was aware of the general formula

$$\sqrt{a^2 + b} = a + \cfrac{b}{2a + \cfrac{b}{2a + \cfrac{b}{2a + \cdots}}},$$

a formula that allows us to represent any square root as a continued
fraction where $a > 0$ and $b > 0$.

Assuming that the continued fraction

$$\cfrac{b}{2a + \cfrac{b}{2a + \cfrac{b}{2a + \cdots}}}$$

converges to a real number $x$, verify the above general formula by
showing that $x = -a + \sqrt{a^2 + b}$.


14.17 (H,S) We discovered that the repeating continued fraction $[1, 2, \overline{3, 4, 5}]$
is equal to $\frac{103 + \sqrt{1297}}{97}$. Verify that $\frac{103 + \sqrt{1297}}{97}$ is indeed a quadratic surd by
finding a quadratic equation $ax^2 + bx + c = 0$ with integer coefficients
for which it is a solution.


14.18 (S) Perhaps it has occurred to you that it is also possible to express a real
number $x$ as an ascending continued fraction:

$$x = \cfrac{1 + \cfrac{1 + \cfrac{1 + \cdots}{a_3}}{a_2}}{a_1}.$$

This representation for a positive real number is equivalent to what is
called the Engel expansion

$$x = \frac{1}{a_1} + \frac{1}{a_1 a_2} + \frac{1}{a_1 a_2 a_3} + \cdots,$$

where $a_1, a_2, \ldots$ are positive integers such that $a_1 \le a_2 \le \cdots$.

Fibonacci, in 1202, used a special notation in his book Liber abaci to
stand for the ascending continued fraction

$$\cfrac{1 + \cfrac{1 + \cfrac{1}{5}}{4}}{3}.$$

(a) Show that Fibonacci’s ascending continued fraction

$$\cfrac{1 + \cfrac{1 + \cfrac{1}{5}}{4}}{3}$$

and the corresponding Engel expansion

$$\frac{1}{3} + \frac{1}{3 \cdot 4} + \frac{1}{3 \cdot 4 \cdot 5}$$

represent the same rational number, and find that number.

(b) Find the first six terms $a_1, a_2, \ldots, a_6$ for an ascending
continued fraction representation of the irrational number
$e \approx 2.71828$. Use here what is called a greedy algorithm by
making each fraction in the Engel expansion as large as possible
by always choosing each $a_i$ to be as small as possible.

(c) Use your result from part (b) to make a conjecture as to what the 
infinite ascending continued fraction representation of the 
irrational number e actually is, and then use a familiar infinite 
series representing e to verify your conjecture. 


Approximation 

14.19 ★ (H) (a) Verify that the repeating continued fraction $[2, \overline{1, 1, 1, 4}]$
represents the irrational number $\sqrt{7}$.

(b) Use this continued fraction and Theorem 14.9 to find a rational
approximation for $\sqrt{7}$ that is accurate to four decimal places.

Do not use a calculator while doing this problem!

14.20 You might have been somewhat annoyed in Problem 14.19 that you
had to find the ninth convergent in order to approximate $\sqrt{7}$ to four
decimal places, whereas we could do that in the text for $\sqrt{5}$ with only
the fourth convergent. In other words, the continued fraction for $\sqrt{7}$
converges more slowly than does the continued fraction for $\sqrt{5}$. What
infinite simple continued fraction converges slower than any other?


14.21 Consider the irrational number $y = [\overline{3, 4, 5}]$. Now, we could fairly
easily express y as a quadratic surd as we did in the text. But instead, in 
order to emphasize the important notion of being able to 
approximate any irrational number by a rational number using 
continued fractions, use Theorem 14.9 to approximate y accurately to 
five decimal places. 

14.22 Again, in order to emphasize the important notion of being able to 
approximate any irrational number by a rational number using 
continued fractions, use Theorem 14.9 to approximate the irrational 
number [1, 2, 3, 4, 5, 6, ... ] accurately to six decimal places. 

14.23 ★ (S) Euler discovered an extraordinarily beautiful infinite simple
continued fraction representation for the number $e$:

e = [2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, ... ].

Use this continued fraction and Theorem 14.9 to express e accurately 
to six decimal places. 

14.24 (H) Theorem 14.9 provides us with a good measure of how accurately a
given convergent approximates an irrational number $x$; in particular,
for each convergent $\frac{a_k}{b_k}$ we are guaranteed that

$$\left| x - \frac{a_k}{b_k} \right| < \frac{1}{b_k b_{k+1}}.$$

Show that, on the other hand, it is also true that for each
convergent $\frac{a_k}{b_k}$ we have

$$\left| x - \frac{a_k}{b_k} \right| > \frac{1}{2 b_k b_{k+1}}.$$

Pell’s Equation

14.25 We showed that the repeating continued fraction $[1, 2, \overline{3, 4, 5}]$
represents the quadratic surd $\frac{103 + \sqrt{1297}}{97}$. Verify that the conjugate of
this quadratic surd does not lie between -1 and 0.

14.26 (S) We showed in the text that for each nonsquare positive integer n there 

are only a finite number of quadratic surds that satisfy 
conditions (1) and (3) of Theorem 14.15. 

(a) Verify that if $m$ and $d$ are positive integers that satisfy

$$0 < m < \sqrt{n} \quad \text{and} \quad \sqrt{n} - m < d < \sqrt{n} + m,$$

then the quadratic surd $\frac{m + \sqrt{n}}{d}$ satisfies conditions (1) and (3) of
Theorem 14.15.

(b) Find all the quadratic surds of the form m+ that satisfy 
conditions (1) and (3) of Theorem 14.15. 

(c) For a given nonsquare positive integer n, count how many 
quadratic surds there are that satisfy conditions (1) and (3) 
of Theorem 14.15. 

14.27 (S) Since the conjugate of the quadratic surd does not lie between 

-1 and 0, we know by Theorem 14.15 that its continued fraction is not 
purely periodic. However, we know by Lagrange’s theorem, Theorem 
14.8, that n+ X 2 can be represented by a repeating simple continued 
fraction of the form $x = [q_1, q_2, \ldots, q_m, \overline{p_1, p_2, \ldots, p_n}]$.

Compute the continued fraction for n +:p and, in particular, show 
that as you compute the partial quotients for this continued fraction 
you at some point reach a quadratic surd that satisfies the hypothesis 
of Theorem 14.16, and so the remainder of the continued fraction 
becomes purely periodic. 

14.28 (H) We have been writing quadratic surds in the form $\frac{m + \sqrt{n}}{d}$ where $m$, $n$
and $d$ are integers and $n$ is not a square. In this problem it will be
somewhat easier to write quadratic surds in the form $a + b\sqrt{n}$ where $a$
and $b$ are rational. Then, for a quadratic surd $x = a + b\sqrt{n}$, we define
the conjugate of $x$ to be $x' = a - b\sqrt{n}$.

Verify that the operation of conjugation respects the arithmetic
operations of addition, subtraction, multiplication, and division;
that is, prove that if $x = a + b\sqrt{n}$ and $y = r + s\sqrt{n}$ are two quadratic
surds, then

(a) $(x + y)' = x' + y'$, and $(x - y)' = x' - y'$;

(b) $(xy)' = x'y'$, and $\left(\frac{x}{y}\right)' = \frac{x'}{y'}$.

14.29 ★ We computed the first eight convergents for the continued fraction
$\sqrt{23} = [4, \overline{1, 3, 1, 8}]$ in order to find solutions for the Pell equation

$$x^2 - 23y^2 = 1,$$

and we found that the convergents $\frac{24}{5}$ and $\frac{1151}{240}$ produced
two solutions to this equation.

Extend this sequence of convergents

$$\frac{4}{1}, \frac{5}{1}, \frac{19}{4}, \frac{24}{5}, \frac{211}{44}, \frac{235}{49}, \frac{916}{191}, \frac{1151}{240},$$

by another four terms, and verify that the solution produced by the
convergent $\frac{a_{12}}{b_{12}}$ is the same solution we found by using Theorem 14.19
and computing $(24 + 5\sqrt{23})^3$.

14.30 (S) Consider the Pell equation 

x 2 - 19y 2 = 1. 

(a) The continued fraction for $\sqrt{19}$ is $[4, \overline{2, 1, 3, 1, 2, 8}]$. Use
Theorem 14.18 to find the smallest positive solution to
$x^2 - 19y^2 = 1$.

(b) Use Theorem 14.19 to find three more solutions.

14.31 Consider the Pell equation

$$x^2 - 29y^2 = 1.$$

(a) The continued fraction for $\sqrt{29}$ is $[5, \overline{2, 1, 1, 2, 10}]$. Use
Theorem 14.18 to find the smallest positive solution to
$x^2 - 29y^2 = 1$.

(b) Use Theorem 14.19 to find two more solutions. 


15 

Partitions 


In 1674, Gottfried Leibniz wrote a letter to Johann Bernoulli asking 
whether he had ever thought about the number of “divulsions” of in- 
tegers. What Leibniz called a divulsion, we now call a partition. Leibniz 
gave the following examples: 

for n = 3, there are three partitions: 3, 2 + 1, 1 + 1 + 1;

for n = 4, there are five partitions: 4, 3 + 1, 2 + 2, 2 + 1 + 1,
1 + 1 + 1 + 1;

for n = 5, there are seven partitions: 5, 4 + 1, 3 + 2, 3 + 1 + 1,
2 + 2 + 1, 2 + 1 + 1 + 1, 1 + 1 + 1 + 1 + 1;

for n = 6, there are eleven partitions: 6, 5 + 1, 4 + 2, 4 + 1 + 1,
3 + 3, 3 + 2 + 1, 3 + 1 + 1 + 1, 2 + 2 + 2, 2 + 2 + 1 + 1,
2 + 1 + 1 + 1 + 1, 1 + 1 + 1 + 1 + 1 + 1;

for n = 7, there are fifteen partitions: 7, 6 + 1, 5 + 2, 5 + 1 + 1,
4 + 3, 4 + 2 + 1, 4 + 1 + 1 + 1, 3 + 3 + 1, 3 + 2 + 2,
3 + 2 + 1 + 1, 3 + 1 + 1 + 1 + 1, 2 + 2 + 2 + 1, 2 + 2 + 1 + 1 + 1,
2 + 1 + 1 + 1 + 1 + 1, 1 + 1 + 1 + 1 + 1 + 1 + 1.

Thus a partition of a positive integer n is an expression representing
n as a sum of positive integers. As the examples given by Leibniz make 
clear, we do not worry about the order of the numbers in the sum. So the 
three expressions 3 + 1 + 1 , 1 + 3 + 1 , and 1 + 1 + 3 are just three different 
ways to write the same partition of 5. We normally prefer to write the 
numbers in a partition in descending order, as in the examples above. 

Since our primary concern in this chapter will be to ask how many
ways a given integer can be written as a sum of positive integers, we
define $p(n)$ to be the number of partitions of $n$, where $n$ is a positive
integer. Here are the first few values of this function:

p(1) = 1, p(2) = 2, p(3) = 3, p(4) = 5, p(5) = 7, p(6) = 11, p(7) = 15.

We also find it useful to define $p(0) = 1$. (We do this for much the same
reason that we define $0! = 1$ so that the equation $\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$ makes
sense even in the cases when $r = 0$ and $r = n$.)
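
The values of $p(n)$ listed above can be checked by machine. The following is a small sketch, not a method taken from the text: the name `partition_counts` is hypothetical, and the code counts partitions by allowing parts of size 1, then 2, then 3, and so on, so that each partition is counted once regardless of the order of its parts.

```python
def partition_counts(limit):
    """Return [p(0), p(1), ..., p(limit)], with the convention p(0) = 1."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        # after this pass, p[total] counts partitions of total
        # whose parts are all at most `part`
        for total in range(part, limit + 1):
            p[total] += p[total - part]
    return p

print(partition_counts(7))
```

The printed list starts with $p(0) = 1$ and continues with the values $p(1), \ldots, p(7)$ given above.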



Euler 




In 1740, a mathematician in Berlin, Philipp Naude, wrote a letter to 
Euler asking about partitioning a positive integer n into m parts. Specif- 
ically, he asked how many ways can you partition 50 into seven distinct 
parts? This is not an easy question, but Euler came up with a clever way 
to approach it. 

Here is an example to illustrate his basic idea. Let n = 10 and m = 3, 
and ask how many partitions are there of 10 into three distinct parts? 
Since 10 and 3 are small, it is possible to simply list all such partitions: 
if the smallest of the three numbers is 1, then we need to write 9 as the 
sum of two distinct numbers neither of which is 1, and we can do this 
only as 7 + 2, 6 + 3, and 5 + 4; on the other hand, if the smallest of the 
three numbers is 2, then we need to write 8 as the sum of two distinct 
numbers neither of which is 1 or 2, and we can do this only as 5 + 3; 
finally, if the smallest of the three numbers is $a \ge 3$, then we would
need to write $10 - a$ as a sum of two distinct numbers each of which
is at least 4, which is impossible since $10 - a \le 7$. Thus there are four
partitions of 10 into three distinct parts:

7 + 2 + 1, 6 + 3 + 1, 5 + 4 + 1, 5 + 3 + 2.
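
The case analysis above can be confirmed by brute force. This sketch (the function name `distinct_partitions` is hypothetical) simply tries every set of $m$ distinct positive integers and keeps those that add up to $n$; it is practical only for small $n$ and $m$.

```python
from itertools import combinations

def distinct_partitions(n, m):
    """All partitions of n into m distinct positive parts, largest part first."""
    return [combo[::-1]
            for combo in combinations(range(1, n + 1), m)
            if sum(combo) == n]

for parts in distinct_partitions(10, 3):
    print(" + ".join(str(part) for part in parts))
```

Because `combinations` yields increasing tuples in lexicographic order, the partitions appear grouped by their smallest part, just as in the argument above.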

Euler’s clever idea was to create an infinite product

$$(1 + zy)(1 + zy^2)(1 + zy^3)(1 + zy^4)(1 + zy^5) \cdots,$$

which can be expanded, just like any polynomial, yielding

$$1 + zy + zy^2 + zy^3 + z^2y^3 + zy^4 + z^2y^4 + zy^5 + 2z^2y^5 + \cdots.$$

It is easy to see where each of the first few terms in this expansion
comes from. For example, the term $z^2y^3 = (zy)(zy^2)$, and this is the only
way to form $z^2y^3$ in this infinite product. Somewhat more accurately, in
this infinite product $z^2y^3$ is the infinite product $zy \cdot zy^2 \cdot 1 \cdot 1 \cdot 1 \cdots$,
but no purpose is served by writing all the 1s. What about the term
$2z^2y^5$? Where does it come from? Well, there are two ways to achieve
a product $z^2y^5$: either from $(zy)(zy^4)$ or from $(zy^2)(zy^3)$; and we get
$(zy)(zy^4) + (zy^2)(zy^3) = 2z^2y^5$.

So, what would the coefficient of $z^3y^{10}$ be in this expansion? Since we
have to put together a product of three distinct terms, $zy^i$, $zy^j$, and $zy^k$
such that $i + j + k = 10$, this amounts to finding all the ways to have
three distinct positive integers $i, j, k$ such that $i + j + k = 10$, and we
already know how to do this:

7 + 2 + 1, 6 + 3 + 1, 5 + 4 + 1, 5 + 3 + 2.

This means that in the expansion the coefficient of the term $z^3y^{10}$ is 4;
and so, in the expansion this term appears as $4z^3y^{10}$.

In other words, what this example makes clear is that in the expansion
of Euler’s product $(1 + zy)(1 + zy^2)(1 + zy^3) \cdots$, the coefficient
of the $z^m y^n$ term is the number of partitions of the integer $n$ into $m$
distinct parts. However, there is still work to do to make Euler’s idea
useful. Imagine the work involved to answer Naude’s original question.
You would have to expand Euler’s product all the way out to the $z^7y^{50}$
term. (Of course, these days we could use, say, Sage and do this rather
quickly, but for the moment we are back with Euler in the middle of the
eighteenth century.)
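
To make the remark about machine expansion concrete, here is a minimal sketch in plain Python rather than Sage. The representation is an assumption chosen for illustration: a dictionary maps the exponent pair $(m, n)$ to the coefficient of $z^m y^n$, and the hypothetical function `euler_product_coeffs` multiplies in the factors $(1 + zy^i)$ one at a time, discarding every term whose power of $y$ exceeds `max_n` since such terms cannot affect the coefficients we keep.

```python
def euler_product_coeffs(max_n):
    """Coefficients of (1 + z*y)(1 + z*y^2)...(1 + z*y^max_n), truncated at y^max_n.

    Returns a dict {(m, n): coefficient of z^m * y^n}.
    """
    coeffs = {(0, 0): 1}                       # the constant term 1
    for i in range(1, max_n + 1):
        updated = dict(coeffs)                 # contribution of the "1" in (1 + z*y^i)
        for (m, n), c in coeffs.items():
            if n + i <= max_n:                 # contribution of the "z*y^i"
                key = (m + 1, n + i)
                updated[key] = updated.get(key, 0) + c
        coeffs = updated
    return coeffs

coeffs = euler_product_coeffs(10)
print(coeffs[(2, 5)], coeffs[(3, 10)])   # coefficients of z^2*y^5 and z^3*y^10
```

Factors $(1 + zy^i)$ with $i$ larger than `max_n` contribute only through their constant term 1, which is why the finite product suffices. Calling `euler_product_coeffs(50)` and looking up the key `(7, 50)` would address Naude's question directly.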

So, let us now define $P(m, n)$ to be the number of partitions of an
integer $n$ into a sum of $m$ distinct positive integers. Then, as we have just
seen,

$$\sum_{m,n \ge 0} P(m, n) z^m y^n = (1 + zy)(1 + zy^2)(1 + zy^3) \cdots = \prod_{i=1}^{\infty} (1 + zy^i).$$

Note that we have to define $P(0, 0) = 1$ so that the first term on the left
is 1 to correspond to the first term of the infinite product, which is 1.

We can make a few more observations about particular values of
$P(m, n)$. For any $n \ge 1$, $P(0, n) = 0$ since $n$ cannot be a sum of 0 positive
integers; similarly, for any $n \ge 1$, $P(1, n) = 1$ since there is only one way
to represent $n$ as a sum of one positive integer; namely, as $n = n$. It is also
clear that for $m \ge 1$, $P(m, 0) = 0$ since 0 cannot be expressed as a sum of
positive integers.

Thus we have the barest beginning for a table of values for $P(m, n)$:

m\n   0  1  2  3  4  5  ...
0     1  0  0  0  0  0  ...
1     0  1  1  1  1  1  ...
2     0
3     0

In the next theorem we will present a recursive method for filling in the
rest of this table, or more accurately filling in as much of the table as we
need in order to find a particular value of $P(m, n)$.

However, before we do that, let’s make one more observation that
fills in a large portion of this table. For any given positive value of $m$
it is clear that the smallest value of $n$ that can be written as a sum of $m$
distinct positive integers is $1 + 2 + 3 + \cdots + m$, that is, the triangular
number $t_m$; and, of course, there is only one way to do this. In the row
corresponding to $m$, there are $t_m$ zeros followed by a 1. Hence we know
that, so far, the table looks like this:


m\n   0  1  2  3  4  5  6  7  8  9  10  ...
0     1  0  0  0  0  0  0  0  0  0   0  ...
1     0  1  1  1  1  1  1  1  1  1   1  ...
2     0  0  0  1
3     0  0  0  0  0  0  1
4     0  0  0  0  0  0  0  0  0  0   1


In order to efficiently fill in more of the table, we need the following 
theorem. 

Theorem 15.1. For $m \ge 1$ and $n \ge m$, we have the following recursion
formula:

$$P(m, n) = P(m, n - m) + P(m - 1, n - m).$$

Proof 

We can rewrite Euler’s infinite product as

$$\prod_{i=1}^{\infty} (1 + zy^i) = (1 + zy)(1 + zy^2)(1 + zy^3)(1 + zy^4)(1 + zy^5) \cdots$$
$$= (1 + zy) \cdot \left( (1 + (zy)y)(1 + (zy)y^2)(1 + (zy)y^3)(1 + (zy)y^4) \cdots \right)$$
$$= (1 + zy) \prod_{i=1}^{\infty} (1 + (zy)y^i).$$

In the original product the term $z^m y^n$ occurs $P(m, n)$ times. On the other
hand, in the rewritten product it occurs in two different ways: first, it
arises when a term $(zy)^m y^{n-m}$ from the new infinite product is multiplied
by the 1 in the factor $1 + zy$, and this happens $P(m, n - m)$ times (because
$(zy)^m y^{n-m} = z^m y^n$); second, it arises when $zy$ in the factor $(1 + zy)$ is
multiplied by a term $z^{m-1} y^{n-1}$, and this happens
$P(m - 1, n - m)$ times (because $(zy)^{m-1} y^{n-m} = z^{m-1} y^{n-1}$).

Therefore,

$$P(m, n) = P(m, n - m) + P(m - 1, n - m).$$

This completes the proof of the theorem. ■ 

Now, using this theorem we can fill in more values in the table for 
P(m, n). Begin with P(2, 4) = P(2, 2) + P(1, 2) = 0 + 1 = 1; then 
P(2, 5) = P(2, 3) + P(1, 3) = 1 + 1 = 2; P(2, 6) = P(2, 4) + P(1, 4) = 
1 + 1 = 2. Once you get the hang of it, this can go very quickly, since at 
any point to get the next number in a given row you move one column 
to the right from where you did your previous calculation and add the 
number in the given row in that column and the number above it. Here 
is the completed table for 0 ≤ m ≤ 4 and 0 ≤ n ≤ 10: 


    m\n   0   1   2   3   4   5   6   7   8   9   10  ···

    0     1   0   0   0   0   0   0   0   0   0   0   ···
    1     0   1   1   1   1   1   1   1   1   1   1   ···
    2     0   0   0   1   1   2   2   3   3   4   4   ···
    3     0   0   0   0   0   0   1   1   2   3   4   ···
    4     0   0   0   0   0   0   0   0   0   0   1   ···


Note that our last calculation in the above table produced P(3, 10), 
which coincides with our earlier conclusion that there are four parti- 
tions of 10 into three distinct parts: 

    7 + 2 + 1,   6 + 3 + 1,   5 + 4 + 1,   5 + 3 + 2. 
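
The recursion of Theorem 15.1 is easy to mechanize. The following Python sketch is one possible way to do so and is not part of the original text; the function name P simply mirrors the notation above, and the boundary values (row m = 0, and zero whenever n < m) are read off from the table.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def P(m, n):
    """Number of partitions of n into exactly m distinct positive parts."""
    if m == 0:
        # Row 0 of the table: only the empty sum, which adds up to 0.
        return 1 if n == 0 else 0
    if n < m:
        # m distinct positive parts can never add up to less than m.
        return 0
    # Theorem 15.1: P(m, n) = P(m, n - m) + P(m - 1, n - m).
    return P(m, n - m) + P(m - 1, n - m)


# Print rows m = 0..4 for n = 0..10, in the layout of the table above.
for m in range(5):
    print(m, [P(m, n) for n in range(11)])
```

Caching the values means each table entry is computed only once, which is the programmatic counterpart of filling in the table from left to right.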


Generating Functions 

The infinite product 

    \sum_{m,n \ge 0} P(m, n) z^m y^n = (1 + zy)(1 + zy^2)(1 + zy^3) \cdots

that Euler produced to study the number of partitions of an integer n 
into m distinct parts is a good example of a generating function because 
the coefficients of the individual terms of the expansion represent 
exactly what we are looking for. In other words, for a given value of 
m and n the number of partitions of n into m distinct parts, that is, 
P(m, n), is literally generated by Euler's product as the coefficient of 
z^m y^n. 

We introduced the idea of generating functions in Chapter 12 
where we let the generating function for the Fibonacci sequence 
F_0, F_1, F_2, ... be given by 

    f(x) = F_0 + F_1 x + F_2 x^2 + F_3 x^3 + F_4 x^4 + \cdots ,

and we then showed that 

    f(x) = \frac{x}{1 - x - x^2},

and so x/(1 - x - x^2) is the generating function for the Fibonacci sequence. 
We then used this generating function to prove Binet's formula 
for the Fibonacci numbers. We also used generating functions in 
Problem 12.38 to prove the following identity involving Fibonacci 
numbers: 

    F_0 + F_1 + F_2 + F_3 + \cdots + F_n = F_{n+2} - 1. 

Not surprisingly, in addition to his generating function for P(m, n), 
Euler was also able to find a generating function for p(n), the number 
of partitions of n. Consider the following infinite product, in which we 
have intentionally written each factor only as far out as the x^6 term: 

    (1 + x^1 + x^{1+1} + x^{1+1+1} + x^{1+1+1+1} + x^{1+1+1+1+1} + x^{1+1+1+1+1+1} + \cdots)
      \times (1 + x^2 + x^{2+2} + x^{2+2+2} + \cdots)
      \times (1 + x^3 + x^{3+3} + \cdots)
      \times (1 + x^4 + \cdots)
      \times (1 + x^5 + \cdots)
      \times (1 + x^6 + \cdots)

Expanding this infinite product, we get 

    1 + x + 2x^2 + 3x^3 + 5x^4 + 7x^5 + 11x^6 + \cdots . 

Let's see where each of these coefficients comes from. The first term, 1, 
is obvious. Similarly, there is only one x (in the first parenthesis). Next, 
in the expansion we can get x^2 in two ways: by choosing x^{1+1} from 
the first parenthesis or by choosing x^2 from the second parenthesis; 
thus x^{1+1} + x^2 = 2x^2. Continuing, we can get x^3 in three ways: by 
choosing x^{1+1+1} from the first parenthesis, or by choosing x^1 from the 
first parenthesis and x^2 from the second parenthesis to get x^1 x^2 = x^3, or 
by choosing x^3 from the third parenthesis; thus 

    x^{1+1+1} + x^1 x^2 + x^3 = 3x^3. 

Let's do one more term. The coefficient for the x^4 term comes from 

    x^{1+1+1+1} + x^{1+1} x^2 + x^1 x^3 + x^{2+2} + x^4 = 5x^4. 

Here, we see that the exponents on the left are nothing more than the 
five partitions of 4: 1 + 1 + 1 + 1, 1 + 1 + 2, 1 + 3, 2 + 2, and 4. By now, it 
should be clear that Euler's infinite product, when expanded, becomes 

    p(0) + p(1)x + p(2)x^2 + p(3)x^3 + p(4)x^4 + p(5)x^5 + p(6)x^6 + \cdots . 

In other words, his infinite product is the generating function for p(n). 
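
The expansion can also be checked mechanically. Here is a minimal Python sketch (an illustration only; the name expand_euler_product is made up for this example) that multiplies in the factors one at a time, keeping only the terms up to a chosen power of x. It relies on the observation that multiplying a truncated series by 1 + x^k + x^{2k} + ··· amounts to adding, to each coefficient, the already updated coefficient k places earlier.

```python
def expand_euler_product(limit):
    """Coefficients of x^0 .. x^limit in the product of (1 + x^k + x^(2k) + ...), k >= 1."""
    coeffs = [1] + [0] * limit          # start from the constant series 1
    for k in range(1, limit + 1):       # factors with k > limit cannot affect these terms
        for n in range(k, limit + 1):
            # coeffs[n - k] already includes the new factor, so this single
            # pass accounts for x^k, x^(2k), x^(3k), ... all at once.
            coeffs[n] += coeffs[n - k]
    return coeffs


print(expand_euler_product(6))   # coefficients of 1, x, x^2, ..., x^6
```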

Now that we see how this is working, we will write Euler's infinite 
product more simply as 

    (1 + x^1 + x^2 + x^3 + x^4 + \cdots) \times (1 + x^2 + x^4 + x^6 + x^8 + \cdots)
      \times (1 + x^3 + x^6 + x^9 + x^{12} + \cdots) \times (1 + x^4 + x^8 + x^{12} + x^{16} + \cdots)
      \times (1 + x^5 + x^{10} + x^{15} + x^{20} + \cdots) \times (1 + x^6 + x^{12} + x^{18} + x^{24} + \cdots) \times \cdots

and even more compactly as 

    \prod_{n=1}^{\infty} (1 + x^n + x^{2n} + x^{3n} + x^{4n} + \cdots).

Since 1 + x^n + x^{2n} + x^{3n} + x^{4n} + ··· = 1/(1 - x^n) for each n, we can summarize 
everything to this point by writing Euler's generating function for 
p(n) as 

    \sum_{n=0}^{\infty} p(n) x^n = \prod_{n=1}^{\infty} \frac{1}{1 - x^n}.

However, we still need an effective way of computing values of p(n), 
perhaps something analogous to Theorem 15.1, which allowed us to 
compute values of P(m, n) so easily. For this, we again look to Euler for 
inspiration. 


Euler's Pentagonal Number Theorem 

In Problem 2.12, we mentioned that, in addition to square and triangular 
numbers, Nichomachus discussed other numbers having geometric 
qualities. In particular, he studied the pentagonal numbers p_k 

    1, 5, 12, 22, 35, 51, 70, ... 

that satisfy the formula p_k = k(3k - 1)/2. 

If we also consider a similar sequence of numbers 

    2, 7, 15, 26, 40, 57, 77, ... 

that satisfy the formula q_k = k(3k + 1)/2, we can join these two sequences to 
get a sequence of numbers 

    1, 2, 5, 7, 12, 15, 22, 26, 35, 40, 51, 57, 70, 77, ... 

that are called the Euler pentagonal numbers. 
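
As a small illustration (not from the original text), the merged sequence can be generated by alternating the two formulas k(3k - 1)/2 and k(3k + 1)/2 for k = 1, 2, 3, .... The generator name below is an arbitrary choice for this sketch.

```python
from itertools import count, islice


def euler_pentagonal_numbers():
    """Yield 1, 2, 5, 7, 12, 15, ... by alternating the two formulas."""
    for k in count(1):
        yield k * (3 * k - 1) // 2   # p_k
        yield k * (3 * k + 1) // 2   # q_k


print(list(islice(euler_pentagonal_numbers(), 14)))
```

Because p_k < q_k < p_{k+1} for every k, alternating the two formulas already produces the numbers in increasing order, so no sorting is needed.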

In 1751, Euler published a memoir in which he described a beautiful 
formula he had discovered, but for which he had no proof and could 
offer only what he called “a long induction,” by which he meant that 
he had laboriously verified his formula for a very large number of terms. 
Here is a slightly edited passage in which he describes this discovery and 
makes reference to a formula for the Euler pentagonal numbers: 

In considering the partitions of numbers, I examined, a long time 
ago, the expression 


    (1 - x)(1 - x^2)(1 - x^3)(1 - x^4)(1 - x^5)(1 - x^6)(1 - x^7)(1 - x^8) ··· 

in which the product is assumed to be infinite. In order to see 
what kind of series will result, I multiplied actually a great 
number of factors and found 

    1 - x^1 - x^2 + x^5 + x^7 - x^{12} - x^{15} + x^{22} + x^{26} - x^{35} - x^{40} + ··· . 

The exponents of x are the same which enter into the above 
formula; also the signs + and - arise twice in succession. It 
suffices to undertake this multiplication and to continue it as far 
as is deemed proper to become convinced of the truth of the 
series. Yet I have no other evidence for this, except a long 
induction which I have carried out so far that I cannot in any way 
doubt the law governing the formation of these terms and their 
exponents. I have long searched in vain for a rigorous 
demonstration of the equation between the series and the above 
infinite product (1 - x)(1 - x^2)(1 - x^3) ···, and I proposed the 
same question to some of my friends with whose abilities in these 
matters I am familiar, but all have agreed with me on the truth of 
this transformation of the product into a series, without being 
able to unearth any clue of a demonstration. 

Before we present a proof of this formula, which we now call Euler's 
pentagonal number theorem, 

    \prod_{n=1}^{\infty} (1 - x^n) = 1 - x - x^2 + x^5 + x^7 - x^{12} - x^{15} + x^{22} + x^{26} - \cdots ,

let us do a small portion of the multiplication Euler himself did in order 
to “become convinced of the truth of the series.” 

We begin with (1 - x)(1 - x^2) = 1 - x - x^2 + x^3, which we then multiply 
by (1 - x^3), yielding 1 - x - x^2 + x^4 + x^5 - x^6. We continue multiplying, 
in order, by (1 - x^4), (1 - x^5), ..., (1 - x^8), thus obtaining 

    (1 - x)(1 - x^2)(1 - x^3)(1 - x^4)(1 - x^5)(1 - x^6)(1 - x^7)(1 - x^8)
        = 1 - x - x^2 + x^5 + x^7 + x^9 + \cdots + x^{36}. 

Since any further multiplication involves multiplying by terms (1 - x^9) 
and higher, we know that the first five terms in this series are fixed 
forever, and we can also see that the x^9 term will drop out at the very 
next step; that is, we know that this series begins, just as Euler said, with 
the terms 

    1 - x - x^2 + x^5 + x^7. 

Obviously, Euler did many more calculations than we have done here. 
In Problem 15.6, you will be asked to verify one or two more terms in 
this series. 
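
For readers who would rather let a machine do Euler's multiplication, here is a short Python sketch (our own illustration, with a made-up helper name) that stores a polynomial as a list of coefficients, constant term first, and multiplies by the factors (1 - x), (1 - x^2), ..., (1 - x^8) in turn.

```python
def times_one_minus_x_to_the(k, coeffs):
    """Multiply a polynomial (coefficient list, constant term first) by (1 - x^k)."""
    result = coeffs + [0] * k          # copy of the polynomial, padded to the new degree
    for i, c in enumerate(coeffs):
        result[i + k] -= c             # subtract x^k times the original polynomial
    return result


poly = [1]
for k in range(1, 9):
    poly = times_one_minus_x_to_the(k, poly)

print(poly[:10])                 # coefficients of x^0 through x^9
print(len(poly) - 1, poly[-1])   # degree and leading coefficient
```

Changing the upper limit of the loop from 9 to a larger number carries the multiplication further, which is one way to explore the additional terms mentioned above.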

Now, let's see why the x^8 term disappears in the calculation above, 
whereas the x^7 term does not, and in fact remains with a plus sign. 
This will give us the key for finding a proof of Euler's result. In the 
multiplication above (carried out through the (1 - x^8) factor) an x^8 
was produced six times; here are all six (where multiplications by 1 have 
all been suppressed): 

    (-x)(-x^2)(-x^5),   (-x)(-x^3)(-x^4),   (-x^8), 

    (-x)(-x^7),   (-x^2)(-x^6),   (-x^3)(-x^5). 

The first three products all have an odd number of terms, and will each 
produce -x^8. On the other hand, the last three products all have an even 
number of terms, and will each produce x^8. Therefore, the sum of all six 
terms will be -3x^8 + 3x^8 = 0. Thus the reason the x^8 term disappears 
is that there are the same number of partitions of 8 (into distinct parts) 
with an odd number of terms as there are with an even number of terms. 
Let's see what happens with x^7. Here, we produced x^7 five times: 

    (-x)(-x^2)(-x^4),   (-x^7), 

    (-x)(-x^6),   (-x^2)(-x^5),   (-x^3)(-x^4). 

So, since we have one more product with an even number of terms than 
products with an odd number of terms, we get (-2 + 3)x^7 = +x^7, and 
this is why x^7 appears in the series with a plus sign. 
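
The same bookkeeping can be done for any exponent. The Python sketch below is only an illustration (the function names are ours): it lists the partitions of n into distinct parts and counts how many have an even and how many have an odd number of terms. The coefficient of x^n in the infinite product is the first count minus the second.

```python
def distinct_partitions(n, largest=None):
    """Yield the partitions of n into distinct parts, largest part first."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        # Every later part must be strictly smaller than `first`.
        for rest in distinct_partitions(n - first, first - 1):
            yield (first,) + rest


def even_odd_counts(n):
    """Return (number with an even number of terms, number with an odd number)."""
    even = odd = 0
    for parts in distinct_partitions(n):
        if len(parts) % 2 == 0:
            even += 1
        else:
            odd += 1
    return even, odd


for n in (7, 8):
    even, odd = even_odd_counts(n)
    print(n, even, odd, even - odd)
```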

We are now ready to give the “rigorous demonstration of the equa- 
tion between the series and the above infinite product” for which Euler 
“searched in vain.” 

Theorem 15.2 (Euler's pentagonal number theorem). We have the 
following identity: 

    \prod_{n=1}^{\infty} (1 - x^n) = 1 - x - x^2 + x^5 + x^7 - x^{12} - x^{15} + x^{22} + x^{26} - \cdots ,

where the exponents in the series on the right are the Euler pentagonal 
numbers for k = 1, 2, 3, .... 

Proof 

Our proof here will be very reminiscent of the proof of Theorem 12.5 
where we used the idea of a one-to-one correspondence between two 
sets. Let n ≥ 1. We know from the examples above that if the number 
of partitions of n (into distinct parts) with an odd number of terms 
is the same as those with an even number of terms, then x n will not 
appear as a term in the series. Our goal in this proof is to set up a one- 
to-one correspondence between the set of even partitions of n and the 
set of odd partitions of n. Throughout this proof we will consider only 
partitions having distinct parts. 

First we will define a function that transforms even partitions into 
odd partitions; then we have to decide whether this function is a one- 
to-one correspondence between the two sets. Here is how this function 
works. For the moment, assume that k is an even positive integer and 
that n = n_k + ··· + n_1 is a partition of n. As usual, we write the partition 
in descending order, and so n_k > n_{k-1} > ··· > n_2 > n_1. Next, we let c 
be the number of terms beginning with n_k that are consecutive integers. 
For example, for the partition 20 = 7 + 6 + 5 + 2, we have c = 3 because 
7, 6, 5 are consecutive. 

Now we will transform the even partition n = n_k + ··· + n_1 in one of 
two ways, depending on the relative sizes of n_1 and c. 

(1) If c < n_1, we reduce each of the first c terms by 1, and then add c as 
a new term at the end of the partition. The result is a partition of 
n with one more term; hence it is odd. Here is an example: 
34 = 12 + 11 + 7 + 4 and c = 2 < 4 = n_1, so we transform it into 
34 = 11 + 10 + 7 + 4 + 2. 

(2) If c ≥ n_1, we increase each of the first n_1 terms by 1, and then drop 
n_1 from the partition. The result is a partition of n with one fewer 
term; hence it is odd. Here is an example: 
34 = 9 + 8 + 7 + 5 + 3 + 2 and c = 3 ≥ 2 = n_1, so we transform it into 
34 = 10 + 9 + 7 + 5 + 3. 

The important thing to notice at this point is that this transformation 
is completely reversible; that is, it is its own inverse, and therefore 
is a one-to-one function. Let's see why this is true by looking at the two 
examples we just did. First, consider the transformed partition 34 = 
11 + 10 + 7 + 4 + 2; by the way in which it was constructed it is clear that 
for this partition n_1 ≤ c, so when we now transform it by rule (2) above, 
we get 34 = 12 + 11 + 7 + 4, which is what we started with. Similarly, 
when we consider the transformed partition 34 = 10 + 9 + 7 + 5 + 3, 
it is again clear by the way in which it was constructed that for this 
partition we now have c < n_1, so when we now transform it by rule (1) 
above, we get 34 = 9 + 8 + 7 + 5 + 3 + 2, which is what we started with. 

Because this transformation is its own inverse, we now remove our 
previous restriction that k is even; in other words, we are using the same 
transformation whether we are going from the set of even partitions to 
the set of odd partitions, or vice versa. 

Here comes the main idea of the proof. For some values of n, this 
transformation is truly a one-to-one correspondence between the two 
sets (hence the term x n disappears from the series), but for some other 
values of n the transformation is still a one-to-one function but it misses 
a partition in one of the sets (so it is not a one-to-one correspondence 
and the term x n appears in the series). 

The transformation can miss a partition in exactly two ways. Again, 
we will start with an example. Consider the partition 15 = 6 + 5 + 4; 
here c = 3 < 4 = n_1, so we reduce the first three terms by 1 and add 3 
at the end, but this gives us 15 = 5 + 4 + 3 + 3, which is not a partition 
with distinct terms. What went wrong in this case is that c = k and 
n_1 = k + 1. But, in general, if c = k and n_1 = k + 1, the partition consists 
of k consecutive integers the smallest of which is k + 1, and so 

    n = (k + k) + \cdots + (k + 2) + (k + 1) = k^2 + (k + \cdots + 2 + 1)
      = k^2 + \frac{k(k+1)}{2} = \frac{3k^2 + k}{2} = \frac{k(3k+1)}{2},

and n is an Euler pentagonal number. 

For the second way the transformation can miss a partition, consider 
the partition 22 = 7 + 6 + 5 + 4; here c = 4 ≥ 4 = n_1, so we are 
supposed to increase each of the first four terms by 1 while at the same 
time dropping 4 from the partition, which is impossible. What went 
wrong in this case is that c = k = n_1. But, in general, if c = k = n_1, 
the partition consists of k consecutive integers the smallest of which is 
k, and so 

    n = (k + k - 1) + \cdots + (k + 1) + k = k^2 + ((k - 1) + \cdots + 2 + 1)
      = k^2 + \frac{(k-1)k}{2} = \frac{3k^2 - k}{2} = \frac{k(3k-1)}{2},

and again n is an Euler pentagonal number. 

Note that in each of these two cases, since k is the number of terms 
in the partition, x^n will appear in the series as (-1)^k x^n where 
n = k(3k ± 1)/2. This explains why the plus signs and minus signs 
alternate in pairs in the series. 

This completes the proof of the theorem. ■ 
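
The transformation in this proof is concrete enough to program. The following Python sketch is our own illustration of rules (1) and (2), not code from the original text; the names transform and distinct_partitions are arbitrary, and a partition is represented as a tuple of distinct parts in descending order. The function returns None in the two exceptional cases identified above.

```python
def distinct_partitions(n, largest=None):
    """Yield the partitions of n into distinct parts, largest part first."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in distinct_partitions(n - first, first - 1):
            yield (first,) + rest


def transform(parts):
    """Apply rule (1) or rule (2); return None in the two exceptional cases."""
    k = len(parts)
    smallest = parts[-1]
    # c = number of consecutive integers at the start of the partition.
    c = 1
    while c < k and parts[c] == parts[c - 1] - 1:
        c += 1
    if c < smallest:
        if c == k and smallest == k + 1:
            return None                      # first exceptional case
        # Rule (1): lower the first c terms by 1 and append c.
        return tuple(p - 1 for p in parts[:c]) + tuple(parts[c:]) + (c,)
    if c == k and smallest == k:
        return None                          # second exceptional case
    # Rule (2): raise the first `smallest` terms by 1 and drop the last term.
    return tuple(p + 1 for p in parts[:smallest]) + tuple(parts[smallest:-1])


print(transform((12, 11, 7, 4)))
print(transform((9, 8, 7, 5, 3, 2)))
print(transform((6, 5, 4)), transform((7, 6, 5, 4)))

# Check, for one value of n, that applying the transformation twice
# returns every non-exceptional partition to where it started.
for parts in distinct_partitions(20):
    image = transform(parts)
    assert image is None or transform(image) == parts
```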

Perhaps Euler’s pentagonal number theorem — as interesting as it 
is— might strike you as just a curiosity. However, this theorem in fact 
provides us with just what we have been looking for: an efficient way to 
compute values of the partition function p(n). The recurrence relation 
for p(n) in the following formula was found by Euler, and even today 
there is no more efficient algorithm known for computing values of the 
partition function. 

Theorem 15.3. Let p(n) be the number of partitions of n for each positive 
integer n, and define p(0) = 1. Then p(n) satisfies the following recurrence 
relation: 

    p(n) = p(n - 1) + p(n - 2) - p(n - 5) - p(n - 7) + p(n - 12) + p(n - 15) - ··· , 

where, in the terms p(n - k), k represents successive Euler pentagonal numbers. 

The series on the right is finite and ends at p(n - k), where k is the largest 
Euler pentagonal number for which n - k is nonnegative; in other words, 
the series continues as long as it makes sense. 

Proof 

At this point, we simply need to put a few of the pieces together. 
Recall that Euler found a generating function for p(n): 

    \sum_{n=0}^{\infty} p(n) x^n = \prod_{n=1}^{\infty} \frac{1}{1 - x^n}.

We can rewrite this as 

    \sum_{n=0}^{\infty} p(n) x^n \cdot \prod_{n=1}^{\infty} (1 - x^n) = 1.

But Euler's pentagonal number theorem tells us that 

    \prod_{n=1}^{\infty} (1 - x^n) = 1 - x - x^2 + x^5 + x^7 - x^{12} - x^{15} + x^{22} + x^{26} - \cdots ,

so we conclude that 

    (1 + p(1)x + p(2)x^2 + p(3)x^3 + \cdots) \cdot (1 - x - x^2 + x^5 + x^7 - \cdots) = 1. 

So, when we take the product on the left and compute the coefficient of 
a term x^n where n ≥ 1 we get 0. Therefore, for n ≥ 1, 

    p(n) \cdot 1 + p(n-1) \cdot (-1) + p(n-2) \cdot (-1) + p(n-5) \cdot 1 + p(n-7) \cdot 1 + \cdots = 0, 

and the recurrence relation for p(n) follows. This completes the proof of 
the theorem. ■ 

Let's do a few examples to see how this recurrence relation works. 
Given that p(0) = 1, assume we already know that 

    p(1) = 1, p(2) = 2, p(3) = 3, p(4) = 5, p(5) = 7. 

Then, by Theorem 15.3, we can compute 

    p(6) = p(5) + p(4) - p(1) = 7 + 5 - 1 = 11 

and 

    p(7) = p(6) + p(5) - p(2) - p(0) = 11 + 7 - 2 - 1 = 15. 

In Problem 15.11 you will be asked to compute p(8), p(9), and p(10). It 
is not all that difficult to compute that p(100) = 190 569 292; and the 
value p(200) = 3 972 999 029 388 was known long before the age of 
computers. 
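
To see the recurrence at work on larger values, here is a minimal Python sketch of Theorem 15.3. It is an illustration under our own naming (partition_numbers is not a name used in the text): it builds the list p(0), p(1), ..., p(limit), using for each k the two Euler pentagonal numbers k(3k - 1)/2 and k(3k + 1)/2 with sign + when k is odd and - when k is even.

```python
def partition_numbers(limit):
    """Return the list [p(0), p(1), ..., p(limit)] using Euler's recurrence."""
    p = [1] + [0] * limit                      # p(0) = 1 by definition
    for n in range(1, limit + 1):
        total = 0
        k = 1
        while k * (3 * k - 1) // 2 <= n:
            sign = 1 if k % 2 == 1 else -1     # signs come in pairs: + + - - + + ...
            total += sign * p[n - k * (3 * k - 1) // 2]
            second = k * (3 * k + 1) // 2
            if second <= n:                    # stop as soon as n - k would be negative
                total += sign * p[n - second]
            k += 1
        p[n] = total
    return p


p = partition_numbers(200)
print(p[6], p[7])
print(p[100], p[200])
```

Only about the square root of n earlier values enter the sum for p(n), since the Euler pentagonal numbers grow quadratically; this is what makes the recurrence practical by hand as well as by machine.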


Ferrers Graphs 

An extremely useful way of representing partitions visually was 
introduced in the mid-1880s by America's first great mathematician, 
J. J. Sylvester, who called the representations Ferrers graphs after 
N. M. Ferrers. Here is the Ferrers graph for the partition 17 = 5 + 4 + 
3 + 3 + 2: 

    • • • • •
    • • • •
    • • •
    • • •
    • •

Sylvester pointed out that instead of counting the dots in each row of 
a Ferrers graph, we could also count the dots in each column. Thus the 
above graph also represents the partition 17 = 5 + 5 + 4 + 2 + 1. Such pairs 
of partitions, arising from the same Ferrers graph, are called conjugate. 

Some partitions are conjugate only with themselves; for example, 

    • • • • •
    • • •
    • • •
    •
    •

produces only the partition 13 = 5 + 3 + 3 + 1 + 1 because of its 
symmetry. Such partitions are called self-conjugate. The Ferrers graph of 
a self-conjugate partition of an integer n can be used to find a partition 
of n into distinct odd parts. For example, if we redraw the above graph 
using white and black dots 

    • • • • •
    • o o
    • o •
    •
    •

we see a representation of the partition 13 = 9 + 3 + 1 where the parts 
are all odd and distinct. We could reverse this too; for example, we could 
represent the partition 26 = 11 + 9 + 5 + 1, whose parts are odd and 
distinct, in a symmetric Ferrers graph representing the self-conjugate 
partition 26 = 6 + 6 + 5 + 4 + 3 + 2. 

Thus we have the following result: 

The number of partitions of a positive integer n where the parts 
are all odd and distinct equals the number of self-conjugate 
partitions of n. 
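
This result is easy to test numerically. The Python sketch below is an illustration only, with function names of our own choosing: it computes the conjugate of a partition by counting the dots in each column of its Ferrers graph, and then compares the two counts for a range of values of n.

```python
def partitions(n, largest=None):
    """Yield every partition of n as a tuple of parts in descending order."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def conjugate(parts):
    """Read the Ferrers graph by columns: column j holds one dot per part larger than j."""
    if not parts:
        return ()
    return tuple(sum(1 for p in parts if p > col) for col in range(parts[0]))


def self_conjugate_count(n):
    return sum(1 for p in partitions(n) if conjugate(p) == p)


def distinct_odd_count(n):
    return sum(1 for p in partitions(n)
               if all(part % 2 == 1 for part in p) and len(set(p)) == len(p))


print(conjugate((5, 4, 3, 3, 2)))
for n in range(1, 21):
    print(n, self_conjugate_count(n), distinct_odd_count(n))
```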

This simple result can be used to verify an amazing identity found by 
Euler: 

    (1 + x)(1 + x^3)(1 + x^5) \cdots
        = 1 + \frac{x}{1 - x^2} + \frac{x^4}{(1 - x^2)(1 - x^4)} + \frac{x^9}{(1 - x^2)(1 - x^4)(1 - x^6)} + \cdots .

We won't go into all the details of this verification, but will try to 
indicate the role that Ferrers graphs play. 

Just as we saw that the function 

    (1 + x + x^2 + \cdots)(1 + x^2 + x^4 + \cdots)(1 + x^3 + x^6 + \cdots) \cdots

is the generating function for p(n), the partitions of n, it can be shown 
that (1 + x)(1 + x^3)(1 + x^5) ··· is the generating function for the number 
of partitions of n whose parts are odd and distinct. Therefore, Euler's 
identity can be verified using the result above by showing that the 
right-hand side of the identity generates the number of self-conjugate 
partitions of n. 

To do this we look at Ferrers graphs for self-conjugate partitions in a 
new way. Once again we redraw the graph above highlighting the large 
square of black dots in the upper left: 

    • • • o o
    • • •
    • • •
    o
    o

We can do this for any self-conjugate partition of an integer n, where 
the black dots represent the largest m x m square in the upper left. The 
remaining white dots represent two identical partitions of the integer 
(n - m^2)/2. 

Now, it turns out that the terms on the right-hand side of Euler's 
identity, 

    \frac{x^{m^2}}{(1 - x^2)(1 - x^4) \cdots (1 - x^{2m})},

generate the number of partitions of (n - m^2)/2 
whose parts are no larger than m, thus corresponding to the self- 
conjugate partitions of n having the form illustrated in the graph above. 
Therefore, the right-hand side of Euler's identity generates the number 
of self-conjugate partitions of n. Hence, by the result above, this verifies 
Euler's identity. 

The proof we gave of Euler’s pentagonal number theorem, 
Theorem 15.2, is essentially a proof that was given in 1881 by a 
student of Sylvester’s, Fabian Franklin. As a final example of the 
usefulness of Ferrers graphs, we will briefly describe Franklin’s proof. 

His idea was to use Ferrers graphs to find a one-to-one function 
between the even partitions having distinct parts and the odd partitions 
having distinct parts. Here is how it worked. To emphasize that our 
proof of Theorem 15.2 is really Franklin's proof in disguise, we will use 
the same partitions we used in our proof of Theorem 15.2 in order to 
illustrate his method. 

Here is how Franklin's one-to-one function transforms the partition 
34 = 12 + 11 + 7 + 4 with an even number of parts 

    • • • • • • • • • • • o
    • • • • • • • • • • o
    • • • • • • •
    • • • •

into the partition 34 = 11 + 10 + 7 + 4 + 2 with an odd number of parts 

    • • • • • • • • • • •
    • • • • • • • • • •
    • • • • • • •
    • • • •
    o o

What is so great about Franklin's visual approach is that it is 
immediately obvious that this process is reversible, which is why the 
function is clearly one-to-one. 

Here is the way Franklin's function transforms the partition 34 = 
9 + 8 + 7 + 5 + 3 + 2 with an even number of parts 

    • • • • • • • • •
    • • • • • • • •
    • • • • • • •
    • • • • •
    • • •
    o o

into the partition 34 = 10 + 9 + 7 + 5 + 3 with an odd number of parts 

    • • • • • • • • • o
    • • • • • • • • o
    • • • • • • •
    • • • • •
    • • •


Not surprisingly, Franklin's method makes obvious the two cases that 
are the key to proving Theorem 15.2. Again, we use the same partitions 
we used in our proof. First, consider the partition 15 = 6 + 5 + 4 

    • • • • • o
    • • • • o
    • • • o

which produces the partition 15 = 5 + 4 + 3 + 3 

    • • • • •
    • • • •
    • • •
    o o o

which does not have distinct terms. 

Next, consider the partition 22 = 7 + 6 + 5 + 4 

    • • • • • • •
    • • • • • •
    • • • • •
    o o o o

which is supposed to produce 

    • • • • • • • o
    • • • • • • o
    • • • • • o
            o

which clearly makes no sense. 

Thus, Franklin dramatically made clear the two exceptional cases 
concerning the one-to-one function that are the key to proving 
Theorem 15.2. 


Ramanujan 

We begin this section on Srinivasa Ramanujan, India’s greatest mathe- 
matician, with two very strange examples concerning partitions. 

Example 1. Let n = 10. We then ask ourselves how many partitions of 
10 can be formed using only the numbers 1, 4, 6, 9? We said this was a 
strange example: why would anyone ever ask such a question? 

At least this is not a difficult question to answer. If 9 is the largest 
part, then the partition is 10 = 9 + 1. If 6 is the largest part, then there 
are two possibilities: 10 = 6 + 4 and 10 = 6 + 1 + 1 + 1 + 1. If 4 is 
the largest part, there are again two possibilities: 10 = 4 + 4 + 1 + 1 
and 10 = 4 + 1 + 1 + 1 + 1 + 1 + 1. And, finally, there is 10 = 
1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1. Therefore, there is a total of 
six such partitions. 

At this point in this example, we ask a totally unrelated question. 
How many partitions of 10 are there into distinct parts such that no two 
parts are consecutive? This is not a particularly strange question; you 
might even be able to imagine how such a question could arise quite 
naturally within a particular context. 

Again, it is not difficult to answer this question. The simplest partition 
is of course 10 = 10. Next, we consider the case where 9 is the largest 
part, and get 10 = 9 + 1. Similarly we get 10 = 8 + 2 and 10 = 7 + 3. 
If the largest part is 6, then there are two possibilities: 10 = 6 + 4 
and 10 = 6 + 3 + 1. And that is all there are, because if 5 is the 
largest part, the remaining parts need to be 4 + 1 or 3 + 2, neither of 
which satisfies the nonconsecutive condition; and it only gets worse 
if the largest part is less than 5. Therefore, there is a total of six such 
partitions. 

The fact that the answer to these two seemingly unrelated questions 
is the same is not surprising (just think of how many questions you 
could ask for which the answer is six). 

Let's see if something interesting might be going on by trying a 
different value for n. This time let n = 12, and ask essentially the same 
questions. We see that the partitions of 12 that can be formed using 
only the numbers 1, 4, 6, 9, 11 are 11 + 1, 9 + 1 + 1 + 1, 6 + 6, 6 + 
4 + 1 + 1, 6 + 1 + 1 + 1 + 1 + 1 + 1, 4 + 4 + 4, 4 + 4 + 1 + 1 + 1 + 1, 4 + 
1 + 1 + 1 + 1 + 1 + 1 + 1 + 1, 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1. 
Thus the answer in this case is nine. 

What about how many partitions of 12 into distinct parts are such 
that no two parts are consecutive? Here the partitions are 12, 11 + 1, 
10 + 2, 9 + 3, 8 + 4, 8 + 3 + 1, 7 + 5, 7 + 4 + 1, 6 + 4 + 2. Again, the 
answer is nine. 

Clearly, something quite bizarre may be going on here. The following 
example is similar, and equally strange. 

Example 2. Let n = 10; then we ask ourselves how many partitions of 10 
can be formed using only the numbers 2, 3, 7, 8 for their parts? Again, 
this is not a difficult question to answer. If 8 is the largest part, then 
the partition is 10 = 8 + 2. If 7 is the largest part, then the partition is 
10 = 7 + 3. If 3 is the largest part, then the partition is 10 = 3 + 3 + 2 + 2. 
If 2 is the largest part then 10 = 2 + 2 + 2 + 2 + 2. Therefore, there is a 
total of four such partitions. 

Then we ask how many partitions of 10 can be formed from distinct parts 
greater than 1 such that no two parts are consecutive? So, we have 
10 = 10, 10 = 8 + 2, 10 = 7 + 3, 10 = 6 + 4; again, there is a total 
of four such partitions. 

Let's try n = 12 to see if the pattern holds. The partitions of 12 that 
can be formed using only the numbers 2, 3, 7, 8, 12 are 12, 8 + 2 + 2, 7 + 
3 + 2, 3 + 3 + 3 + 3, 3 + 3 + 2 + 2 + 2, 2 + 2 + 2 + 2 + 2 + 2. Thus the 
answer is six. 

How many partitions of 12 can be formed from distinct parts greater 
than 1 such that no two parts are consecutive? We have 12, 10 + 2, 9 + 3, 
8 + 4, 7 + 5, 6 + 4 + 2; again, the total is six. 

These examples illustrate two remarkable theorems discovered by Ra- 
manujan. We will soon discuss the interesting history of these famous 
theorems, but for now we merely state them. 

Theorem 15.4. The number of partitions of a positive integer n into distinct 
parts such that no two parts are consecutive equals the number of partitions 
of n using numbers only of the form 5k + 1 or 5k + 4. 

Theorem 15.5. The number of partitions of a positive integer n into distinct 
parts such that no two parts are consecutive and all parts are greater than 1 
equals the number of partitions of n using numbers only of the form 5k + 2 or 
5k + 3. 
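
Neither theorem is proved here, but both are easy to test for small n. The Python sketch below is our own illustration, with hypothetical function names: one function counts partitions whose parts come from an allowed list, and the other counts partitions into distinct, nonconsecutive parts that are all at least a given size.

```python
def count_with_parts(n, residues):
    """Partitions of n using only parts congruent mod 5 to one of `residues`."""
    allowed = [part for part in range(1, n + 1) if part % 5 in residues]
    ways = [1] + [0] * n
    for part in allowed:
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]


def count_nonconsecutive(n, smallest_allowed):
    """Partitions of n into distinct parts, no two consecutive, each >= smallest_allowed."""
    def count(remaining, max_part):
        if remaining == 0:
            return 1
        # Choose the largest part; the next part must be at least 2 smaller.
        return sum(count(remaining - part, part - 2)
                   for part in range(smallest_allowed, min(remaining, max_part) + 1))
    return count(n, n)


for n in (10, 12):
    print(n,
          count_with_parts(n, (1, 4)), count_nonconsecutive(n, 1),
          count_with_parts(n, (2, 3)), count_nonconsecutive(n, 2))
```

Agreement for a range of n is evidence, not proof, much in the spirit of Euler's "long induction" earlier in the chapter.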

Ramanujan was born in 1887 in southern India and grew up in 
a medium-sized town where his father was a clerk in a fabric shop. 
His talent in mathematics was apparent at an early age: when he 
was thirteen he independently discovered Euler's famous formula 
e^{iπ} + 1 = 0. The formative event in Ramanujan's childhood was when, 
at the age of fifteen, he borrowed a book from the local college library by 
G. S. Carr, a tutor at Cambridge, called A Synopsis of Elementary Results 
in Pure Mathematics. This book contained about six thousand theorems, 
but virtually no proofs. Ramanujan worked through this book entirely 
on his own. 

Although Ramanujan excelled as a student in primary and secondary 
schools, and won a scholarship to attend the local college, he was 
by this time so completely focused on mathematics that he failed his 
exams at the end of the first year. The same thing happened four years 
later when he tried a college in Madras, failing his exams in 1906, 
and again a year later. However, by 1910 his mathematical work had 
attracted the attention of several prominent Indian mathematicians, 
one of whom was willing to provide Ramanujan a monthly stipend so 
he could pursue his research. Then, in 1912, he took a position as an 
accounting clerk in the Madras Port Trust Office. The chairman of the 
office was English and the manager was a mathematician, and they 
both urged Ramanujan to contact English mathematicians about his 
discoveries. 

So, on January 16, 1913, Ramanujan wrote a letter to the foremost 
number theorist in England, G. H. Hardy at Trinity College, Cambridge. 
When Hardy opened the letter he found page after page of fantastic- 
looking theorems with no proofs at all. He was not impressed — famous 
mathematicians frequently receive unsolicited manuscripts of little 
value — and so went back to his normal routine eating his breakfast 
while reading The Times and studying cricket scores. 

But Hardy couldn’t get these extraordinary theorems out of his head, 
so later that day he sent a note to his closest colleague, J. E. Littlewood, 
asking him to have a discussion that evening after dinner. So, from 
about 9:00 until almost midnight the two of them sat in Hardy’s rooms 
at Trinity going over the manuscript, and by the end of the evening they 
knew without a doubt that Ramanujan was a genius. 

Hardy decided immediately that Ramanujan must come to Cam- 
bridge. This, however, turned out to be somewhat complicated. Ra- 
manujan was a Brahmin, and in particular his mother had very strict 
reservations about travel overseas, although in the end she finally 
acquiesced. Thus, in March of 1914, Ramanujan set sail for England. 

Ramanujan spent five years in England collaborating with Hardy and 
Littlewood, and in 1918 he became a Fellow of the Royal Society “for his 
investigations in Elliptic functions and the Theory of Numbers.” Tragi- 
cally, Ramanujan’s health suffered greatly during his years in England. 
As a vegetarian it was always a struggle for him. Early on, he had access 
to familiar foods through sources in London, but World War I soon 
cut off this supply. He was eventually diagnosed with tuberculosis and 
spent his remaining time in England in a series of sanatoriums. 

It was at this point in his life that one of the most famous incidents in 
mathematics took place. Hardy came to visit Ramanujan, who seemed 
near death at a hospital in Putney. Hardy, having come by taxi, began 
the conversation with “I thought the number of my taxicab was 1729. 
It seemed to me a very dull number.” Ramanujan immediately replied, 
“No, Hardy! No Hardy! It is a very interesting number. It is the smallest 
number expressible as the sum of two cubes in two different ways.” That 
Ramanujan knew and at this point even cared that 1729 = 1^3 + 12^3 = 
9^3 + 10^3, and is the smallest number with this property, tells us volumes 
about Ramanujan as a mathematician. 

Figure 15.1 G. H. Hardy, 1877-1947 (left); Srinivasa Ramanujan, 1887-1920 
(right). The photograph of G. H. Hardy is reproduced by kind permission of the 
London Mathematical Society. 


Figure 15.1 reproduces Ramanujan’s passport photo for his trip back 
to India. Hardy later said when he saw this photo: “He looks rather ill, 
but he looks all over the genius that he was.” 

In 1919, when it was again safe to travel, Ramanujan was able to 
return to India. He died within a year. The astrophysicist Subrahmanyan 
Chandrasekhar, who would win a Nobel Prize in 1983, was nine years 
old when his mother told him of the death of Ramanujan. Many years 
later he said: 

I can still recall the gladness I felt at the assurance that one 
brought up under circumstances similar to my own could have 
achieved what I could not grasp. 

The fact that Ramanujan’s early years were spent in a 
scientifically sterile atmosphere, that his life in India was not 
without hardships, that under circumstances that appeared to 
most Indians as nothing short of miraculous, he had gone to 
Cambridge, supported by eminent mathematicians, and had 
returned to India with every assurance that he would be 
considered, in time, as one of the most original mathematicians 
of the century — these facts were enough, more than enough, for 
aspiring young Indian students to break their bonds of 
intellectual confinement and perhaps soar the way Ramanujan 
had. 


Chandrasekhar is most famous for predicting in 1930 the maximum 
mass of a stable white dwarf star; this maximum mass is now called 
the Chandrasekhar limit and describes the point beyond which electron 
degeneracy pressure in the core of the star can no longer counteract 
gravitational collapse, thus resulting in a neutron star or a black hole. 

Here are two of the fantastic theorems that Ramanujan sent to Hardy 
in 1913: 




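$$1 + \sum_{n=1}^{\infty} \frac{x^{n^2}}{(1 - x)(1 - x^2) \cdots (1 - x^n)} = \prod_{n=0}^{\infty} \frac{1}{(1 - x^{5n+1})(1 - x^{5n+4})}$$

and 

$$1 + \sum_{n=1}^{\infty} \frac{x^{n^2+n}}{(1 - x)(1 - x^2) \cdots (1 - x^n)} = \prod_{n=0}^{\infty} \frac{1}{(1 - x^{5n+2})(1 - x^{5n+3})}.$$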

Hardy was at a complete loss as to how to prove these identities. Some 
years later it was discovered that these same identities had appeared 
in the Proceedings of the London Mathematical Society in a paper by 
L. J. Rogers in 1894. Hardy and Ramanujan then helped Rogers greatly 
simplify his original proof. These two identities are now called the 
Rogers-Ramanujan identities; over the years, many more proofs of these 
identities have been found. 
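
Although proving these identities is hard, checking them numerically is not. The sketch below compares the power series coefficients of the sum side and the product side of each identity up to a chosen degree. The function names and the cutoff N = 30 are assumptions made for this illustration, and agreement of finitely many coefficients is evidence, not a proof.

```python
def mul_geometric(coeffs, k):
    """Multiply a truncated power series by 1/(1 - x^k), in place."""
    for i in range(k, len(coeffs)):
        coeffs[i] += coeffs[i - k]

def sum_side(shift, N):
    """Coefficients up to x^N of 1 + sum_{n>=1} x^(n^2 + shift*n) / ((1-x)...(1-x^n))."""
    total = [0] * (N + 1)
    total[0] = 1
    n = 1
    while n * n + shift * n <= N:
        term = [0] * (N + 1)
        term[n * n + shift * n] = 1
        for k in range(1, n + 1):
            mul_geometric(term, k)
        total = [s + t for s, t in zip(total, term)]
        n += 1
    return total

def product_side(residues, N):
    """Coefficients up to x^N of the product of 1/(1 - x^k) over k with k mod 5 in residues."""
    coeffs = [0] * (N + 1)
    coeffs[0] = 1
    for k in range(1, N + 1):
        if k % 5 in residues:
            mul_geometric(coeffs, k)
    return coeffs

N = 30  # truncation degree, chosen arbitrarily for this sketch
first_ok = sum_side(0, N) == product_side({1, 4}, N)   # first identity
second_ok = sum_side(1, N) == product_side({2, 3}, N)  # second identity
print(first_ok, second_ok)
```

Factors $1 - x^k$ with $k > N$ cannot affect coefficients up to $x^N$, which is why both sides may be truncated.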

One immediate application of these identities is that they provide 
the relevant generating functions that prove Theorems 15.4 and 15.5 
(see Problem 15.17). Another immediate application is that if we let 
G(x) be the generating function in the first identity and let H(x) be 
the generating function in the second identity, then we can produce 
Ramanujan's continued fraction: 

$$\frac{G(x)}{H(x)} = 1 + \cfrac{x}{1 + \cfrac{x^2}{1 + \cfrac{x^3}{1 + \cdots}}}.$$

Ramanujan sent this continued fraction (and a similar one) to Hardy in 
his letter in the form 

$$\cfrac{1}{1 + \cfrac{e^{-2\pi}}{1 + \cfrac{e^{-4\pi}}{1 + \cfrac{e^{-6\pi}}{1 + \cdots}}}} = \left( \sqrt{\frac{5 + \sqrt{5}}{2}} - \frac{\sqrt{5} + 1}{2} \right) e^{2\pi/5}.$$


Hardy admitted that these “defeated me completely; I had never seen 
anything in the least like them before. A single look at them is enough 
to show that they could only be written down by a mathematician of 
the highest class.” He went on to add: “They must be true because, 
if they were not true, no one would have the imagination to invent 
them.” 

Just to indicate the range of the theorems Ramanujan presented 
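to Hardy in that first letter, here is one that involves Γ, the gamma 
function, which generalizes the factorial function: 

$$\int_0^{\infty} \frac{1 + \left(\frac{x}{b+1}\right)^2}{1 + \left(\frac{x}{a}\right)^2} \cdot \frac{1 + \left(\frac{x}{b+2}\right)^2}{1 + \left(\frac{x}{a+1}\right)^2} \cdots \, dx = \frac{\sqrt{\pi}}{2} \cdot \frac{\Gamma(a + \frac{1}{2}) \, \Gamma(b + 1) \, \Gamma(b - a + \frac{1}{2})}{\Gamma(a) \, \Gamma(b + \frac{1}{2}) \, \Gamma(b - a + 1)}.$$

Ramanujan also produced extraordinarily beautiful infinite series 
representations for π, though this particular one was not unknown to 
Hardy: 

$$\frac{2}{\pi} = 1 - 5\left(\frac{1}{2}\right)^3 + 9\left(\frac{1 \cdot 3}{2 \cdot 4}\right)^3 - 13\left(\frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6}\right)^3 + \cdots.$$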

Here are two more of Ramanujan’s extraordinary representations of π 
as infinite series: 

Much of the work Hardy and Ramanujan did together was devoted to 
partitions. One of their most celebrated results was that, for large values 
of n, 

$$p(n) \sim \frac{e^{\pi \sqrt{2n/3}}}{4n\sqrt{3}}.$$


Another famous result of Ramanujan’s was that, for any positive n, 


$$p(5n + 4) \equiv 0 \pmod{5}, \quad p(7n + 5) \equiv 0 \pmod{7}, \quad p(11n + 6) \equiv 0 \pmod{11}.$$
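
Both results can be explored with a few lines of code. The sketch below computes exact partition numbers by a standard dynamic-programming count (not by any particular theorem of this chapter), compares them with the asymptotic estimate, and searches for counterexamples to the three congruences. The function names and the cutoff of 1000 are assumptions made for this illustration.

```python
import math

def partition_numbers(limit):
    """Return [p(0), p(1), ..., p(limit)] by admitting one part size at a time."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        for total in range(part, limit + 1):
            p[total] += p[total - part]
    return p

def hardy_ramanujan_estimate(n):
    """Asymptotic estimate e^(pi*sqrt(2n/3)) / (4n*sqrt(3)) for p(n)."""
    return math.exp(math.pi * math.sqrt(2 * n / 3)) / (4 * n * math.sqrt(3))

LIMIT = 1000  # arbitrary cutoff for this sketch
p = partition_numbers(LIMIT)

# Ratio of the estimate to the exact value for a few sample n.
for n in (10, 100, 1000):
    print(n, p[n], hardy_ramanujan_estimate(n) / p[n])

# Search for counterexamples to the three congruences among p(0), ..., p(LIMIT).
congruences = [(5, 4), (7, 5), (11, 6)]
violations = [(m, n) for m, r in congruences
              for n in range(r, LIMIT + 1, m) if p[n] % m != 0]
print("violations:", violations)
```

An empty list of violations confirms the congruences only up to the cutoff; it does not replace Ramanujan's proofs.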


Hardy, who obviously was able to judge Ramanujan as a mathe- 
matician far better than anyone else could, eventually decided that, in 
terms of pure mathematical genius, Ramanujan ranked only with Gauss 
and Euler. Here is how Hardy ranked his own mathematical talent as 
compared with Ramanujan: based on a scale from 0 to 100, Hardy gave 
himself 25, Littlewood 30, the great David Hilbert 80, and Ramanujan 
100. 


Problems 

15.1 * (S) Find p(8) by actually finding all partitions of 8. 

15.2 At the very beginning of this chapter, I listed all fifteen partitions of 7, 
and in Problem 15.1 you listed all partitions of 8. 

(a) Use these lists to confirm the entries in the columns for n = 7 
and n = 8 in the table of values for P(m, n) given in the text for 
0 ≤ m ≤ 4 and 0 ≤ n ≤ 10. 

(b) Suppose that in a KenKen puzzle (see Problem 2.22) you have a 
3 × 1 cage labeled 7+. What can you say about the three 
numbers in that cage? What if the cage is labeled 8+? 

15.3 * (S) Use Theorem 15.1 to extend the table of values for P(m, n), which 
was given in the text for 0 ≤ m ≤ 4 and 0 ≤ n ≤ 10, to find the value of 
P(4, 16). 

Then write all of the ways to express 16 as a sum of four distinct 
positive integers. 

15.4 (H,S) Naude’s original question to Euler was to find how many ways you 

can partition 50 into seven distinct parts. In our modern notation, he 
was asking to find P(7, 50). 




Using only pencil and paper (and perhaps a calculator for addition 
if you prefer), extend our table of values for P(m, n) to find P(7, 50). 
The point of this exercise is to illustrate just how powerful Euler’s idea 
is. And, of course, it is vastly more powerful in today’s age of 
computers. 

15.5 (H) In this chapter we have been exclusively concerned with unordered 

partitions; that is, we do not consider the partitioning of 6 as 1 + 4 + 1 
and 1 + 1 + 4 to be different partitions. If, on the other hand, we 
decide that order matters, then such partitions are called ordered 
partitions. So, for example, the number 4 has eight ordered partitions: 

4, 3 + 1, 1 + 3, 2 + 2, 2 + 1 + 1, 1 + 2 + 1, 1 + 1 + 2, 1 + 1 + 1 + 1. 

Show that for any positive integer n there are $2^{n-1}$ ordered 
partitions. 
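
Before attempting the proof in Problem 15.5, it can help to see the count emerge for small n. The sketch below enumerates ordered partitions recursively by choosing the first part and then partitioning what remains; the generator name `compositions` is an assumption made for this illustration.

```python
def compositions(n):
    """Yield every ordered partition of n as a tuple of positive integers."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest

# Count the ordered partitions of n for n = 1, ..., 10.
counts = {n: sum(1 for _ in compositions(n)) for n in range(1, 11)}
print(counts)
print(sorted(compositions(4)))
```

The enumeration only checks small cases; the recursive structure (first part, then the rest) may nonetheless suggest one route to a proof.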

15.6 * Using the fact that 

$$(1 - x)(1 - x^2)(1 - x^3)(1 - x^4)(1 - x^5)(1 - x^6)(1 - x^7)(1 - x^8)
= 1 - x - x^2 + x^5 + x^7 + x^9 - x^{11} - 2x^{12} - x^{13} - x^{15} + \cdots + x^{36},$$

do either of the following in the spirit of Euler’s comment that “it 
suffices to undertake this multiplication and to continue it as far as is 
deemed proper to become convinced of the truth of the series.” 

(a) Continue multiplying in order by $(1 - x^9)$, $(1 - x^{10})$, $(1 - x^{11})$, 
and $(1 - x^{12})$ to verify that Euler’s series begins with the terms 

$$1 - x - x^2 + x^5 + x^7 - x^{12}.$$

(b) Continue multiplying in order by 
$(1 - x^9), (1 - x^{10}), \ldots, (1 - x^{15})$ to verify that Euler’s series 
begins with the terms 

$$1 - x - x^2 + x^5 + x^7 - x^{12} - x^{15}.$$

15.7 (S) For n = 8 verify explicitly that the transformation used in the proof of 

Theorem 15.2 is a one-to-one correspondence between the set of even 
partitions with distinct parts and the set of odd partitions with 
distinct parts. 


15.8 For n = 7 verify explicitly that the transformation used in the proof of 
Theorem 15.2 is a one-to-one function between the set of even 
partitions with distinct parts and the set of odd partitions with 
distinct parts, but that it misses exactly one partition. 

15.9 For n = 12 find the unique partition with distinct parts that is missed 
by the transformation used in the proof of Theorem 15.2. Then 
explain why the term $x^{12}$ appears in the series as $-x^{12}$. 

15.10 For n = 26 find the unique partition with distinct parts that is missed 
by the transformation used in the proof of Theorem 15.2. Then 
explain why the term $x^{26}$ appears in the series as $+x^{26}$. 

15.11 * Use Theorem 15.3 to find p(8), p(9), and p(10). 

15.12 (S) Use Theorem 15.3 to find p(20). 

15.13 * (H) We used Ferrers graphs to prove that the number of partitions of a 
positive integer n where the parts are all odd and distinct equals the 
number of self-conjugate partitions of n. Verify this fact for n = 17. 

15.14 * Among the partitions of 7 there are four partitions having three parts: 

5 + 1 + 1, 4 + 2 + 1, 3 + 3 + 1, 3 + 2 + 2. 

There are also four partitions whose largest part is 3: 

3 + 3 + 1, 3 + 2 + 2, 3 + 2 + 1 + 1, 3 + 1 + 1 + 1 + 1. 

(a) Use Ferrers graphs to prove that for a positive integer n the 
number of partitions of n into a sum of m positive integers 
equals the number of partitions of n into a sum of positive 
integers the largest of which is m. 

(b) Verify the result in part (a) explicitly for n = 8 and the values 
m = 2 and m = 3. 


15.15 This problem concerns Euler’s identity 

$$(1 + x)(1 + x^3)(1 + x^5) \cdots = 1 + \frac{x}{1 - x^2} + \frac{x^4}{(1 - x^2)(1 - x^4)} + \frac{x^9}{(1 - x^2)(1 - x^4)(1 - x^6)} + \cdots.$$

(a) Find the first ten terms, that is, out to the $x^9$ term, of the 
expansion of $(1 + x)(1 + x^3)(1 + x^5) \cdots$. 

(b) We stated that $(1 + x)(1 + x^3)(1 + x^5) \cdots$ is the generating 
function for the number of partitions of n whose parts are odd and 
distinct. Verify this for all 1 ≤ n ≤ 9. 

(c) Using the fact that $\frac{1}{1 - z} = 1 + z + z^2 + z^3 + \cdots$, find the first ten 
terms of the expansion of the right-hand side of Euler’s identity. 

(d) We stated that the right-hand side of Euler’s identity is the 
generating function for the number of self-conjugate partitions 
of n. Verify this for all 1 ≤ n ≤ 9. 

15.16 * (H) Verify Theorem 15.4 for n = 15. 

15.17 We mentioned that proving Theorem 15.4 amounts to verifying the 
first Rogers-Ramanujan identity: 

$$1 + \sum_{n=1}^{\infty} \frac{x^{n^2}}{(1 - x)(1 - x^2) \cdots (1 - x^n)} = \prod_{n=0}^{\infty} \frac{1}{(1 - x^{5n+1})(1 - x^{5n+4})}.$$

(a) Using the fact that $\frac{1}{1 - z} = 1 + z + z^2 + z^3 + \cdots$, find the first 
sixteen terms, that is, out to the $x^{15}$ term, in the expansion of 
the generating function $\prod_{n=0}^{\infty} \frac{1}{(1 - x^{5n+1})(1 - x^{5n+4})}$. 

(b) By Example 1, the coefficients of $x^{10}$ and $x^{12}$ in this expansion 
should be 6 and 9, respectively. Verify that the coefficient of $x^{15}$ 
agrees with your answer in Problem 15.16. 

15.18 (H) Here is a problem to illustrate that Ramanujan showed his 

mathematical talent early in life. When he was fourteen, a fellow 
student gave him the following problem to solve: 

$$\sqrt{x} + y = 7, \qquad \sqrt{y} + x = 11.$$

Now, these simultaneous equations lead to a fourth-degree equation 
that can be solved, but not easily. Ramanujan found a solution in less 
than a minute. Find Ramanujan’s solution. 

15.19 In the years before Ramanujan came to England, he had a number of 
things published in the Journal of the Indian Mathematical Society. One was 
a problem he posed asking for the value of 

$$\sqrt{1 + 2\sqrt{1 + 3\sqrt{1 + \cdots}}}.$$

(a) Using a calculator, evaluate enough terms of this infinite radical 
to make a good conjecture as to its value. 

(b) No one was able to solve Ramanujan’s problem, but he 
possessed a formula that could be used to solve such problems: 

$$x + n + a = \sqrt{ax + (n + a)^2 + x\sqrt{a(x + n) + (n + a)^2 + (x + n)\sqrt{\cdots}}}.$$

Use this formula to solve Ramanujan’s original problem. 
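
For part (a), a short program can stand in for the calculator. The sketch below truncates the radical at a chosen depth and works outward from the innermost square root. Replacing the untouched tail by 1 is an assumption of this sketch, as is the function name `nested_radical`.

```python
import math

def nested_radical(depth):
    """Evaluate sqrt(1 + 2*sqrt(1 + 3*sqrt(... sqrt(1 + depth*1)))) from the inside out."""
    value = 1.0  # stand-in for the infinite tail beyond the chosen depth
    for k in range(depth, 1, -1):
        value = math.sqrt(1 + k * value)
    return value

# Watch the truncations settle down as more levels are included.
for depth in (2, 5, 10, 20, 40):
    print(depth, nested_radical(depth))
```

Each extra level changes the result by less than the one before, so the printed values settle quickly.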

15.20 There seemed to be no limit as to what the mind of Ramanujan could 
produce. Verify the identity 

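$$\left(1 + \frac{1}{7}\right)\left(1 + \frac{1}{11}\right)\left(1 + \frac{1}{19}\right) = \sqrt{2\left(1 - \frac{1}{3^2}\right)\left(1 - \frac{1}{7^2}\right)\left(1 - \frac{1}{11^2}\right)\left(1 - \frac{1}{19^2}\right)}.$$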

15.21 Ramanujan sent this identity to Hardy from India: 


$$\cfrac{1}{1 + \cfrac{e^{-2\pi}}{1 + \cfrac{e^{-4\pi}}{1 + \cfrac{e^{-6\pi}}{1 + \cdots}}}} = \left( \sqrt{\frac{5 + \sqrt{5}}{2}} - \frac{\sqrt{5} + 1}{2} \right) e^{2\pi/5}.$$


Using a calculator, compute as many terms as necessary of the 
continued fraction to verify this identity to ten decimal places. 

15.22 Ramanujan found this infinite series for π around 1910: 

$$\frac{1}{\pi} = \frac{2\sqrt{2}}{9801} \sum_{n=0}^{\infty} \frac{(4n)!}{(n!)^4} \cdot \frac{1103 + 26\,390\,n}{396^{4n}}.$$

Because this series converges so rapidly (each additional term 
produces about eight more decimal places), it was used in 1985 to 
compute π to 17 million decimal places. Even today this series is the 
basis of all new records for the number of digits of π computed. 

Use only the first term of the series (that is, for n = 0) to obtain an 
approximation for π. How many decimal places of accuracy do you 
get? (Assume that π ≈ 3.141 592 653 589 793 238 . . . .) 
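
A partial sum of this series is easy to compute. In the sketch below the sum is accumulated exactly with rational arithmetic and converted to floating point only at the end; the function name `ramanujan_inverse_pi` is an assumption made for this illustration. Ordinary double precision carries only about sixteen significant digits, so anything beyond the second term cannot be told apart this way.

```python
from fractions import Fraction
from math import factorial, pi, sqrt

def ramanujan_inverse_pi(terms):
    """Partial sum of Ramanujan's series for 1/pi using the given number of terms."""
    total = Fraction(0)
    for n in range(terms):
        total += Fraction(factorial(4 * n) * (1103 + 26390 * n),
                          factorial(n) ** 4 * 396 ** (4 * n))
    return 2 * sqrt(2) / 9801 * float(total)

for terms in (1, 2):
    approx = 1 / ramanujan_inverse_pi(terms)
    print(terms, approx, abs(approx - pi))
```

To watch the claimed gain of about eight digits per term over several terms, an arbitrary-precision type would be needed in place of floats.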
15.23 Verify that Ramanujan’s three congruences 

$$p(5n + 4) \equiv 0 \pmod{5}, \quad p(7n + 5) \equiv 0 \pmod{7}, \quad p(11n + 6) \equiv 0 \pmod{11}$$

hold for the first forty-four partition numbers: 

n         1      2      3      4      5      6      7      8      9     10     11
p(n)      1      2      3      5      7     11     15     22     30     42     56

n        12     13     14     15     16     17     18     19     20     21     22
p(n)     77    101    135    176    231    297    385    490    627    792   1002

n        23     24     25     26     27     28     29     30     31     32     33
p(n)   1255   1575   1958   2436   3010   3718   4565   5604   6842   8349  10143

n        34     35     36     37     38     39     40     41     42     43     44
p(n)  12310  14883  17977  21637  26015  31185  37338  44583  53174  63261  75175


Hints 


Chapter 1 

1.2 What are the possible remainders for squares when divided by 4? 

1.3 In order to avoid trial and error, make use of the fact that the area is 
given by 

$$A = \tfrac{1}{2}xy = st(s^2 - t^2) = st(s + t)(s - t).$$

Then, find two sets of values for s and t so that the four terms s, t, s + t, 
and s - t yield the same product. 

1.4 Use Theorem 1.1 and consider two cases, one where n is even and one 
where n is odd. 

1.5 Subdivide the Pythagorean triangle into three smaller triangles each 
having altitude equal to the radius. Then use the fact that the area of 
the Pythagorean triangle will be equal to the sum of the areas of these 
three triangles. 

1.6 This is a good time to apply a general rule of thumb about math 
competitions: “always hope for the best.” Wouldn’t it be nice if the 
answer here involved a primitive Pythagorean triangle? 

1.7 The set {6, 10, 15} is such a set. 

1.8 You will find that 1/7 is a repeating fraction in both systems. As a 
decimal, 1/7 repeats with a period of 6, but as a sexagesimal, it repeats 
with a period of 3. 

1.11 Do not do any unnecessary work here; you may want to use your 
calculator, but you should only need to use the + key once, and maybe 
one or two other arithmetic operations. 

1.12 (b) In order to find the integer n with the property that the sum of the 
first n squares is a square, the idea is to look for an integer n so that each 
of the three terms n/6, n + 1, and 2n + 1 is a square. 

1.13 There are several equally good approaches to this proof. One approach 
that is completely natural is to consider a number n > 1, and simply 
observe that n must be either prime or composite. Then follow where 
either of these two possibilities leads you. 

This approach can be made more formal by supposing there is a 
composite number that cannot be written as a product of primes. Then 
you try to reach a contradiction. 

A third approach that works particularly well in this case is a form of 
induction called strong induction. The variation from ordinary 
induction is a minor one: you prove a statement is true for a small value 
such as n = 1 (or, in this case, n = 2), and then you prove that 
whenever the statement is true for all values up to and including n, it is 
also true for the next value n+ 1. 

1.14 For squares it was sufficient to consider two cases n = 2k and n = 2k + 1; 
for cubes it is sufficient to consider three cases n = 3k, n = 3k + 1, and 
n = 3k + 2. Make sure you understand why there are three cases here; for 
example, why isn't n = 3k + 3 a new case? 

1.16 Of course, you could simply consider eight cases here: n = 8k, 

n = 8k + 1, n = 8k + 2, . . . , and n = 8k + 7. But, just as in Problem 1.14 
where we could consider only three cases instead of nine, it is possible 
here to consider fewer than eight cases. 

For the second part of this problem, suppose (by way of 
contradiction) that an integer N of the form 8k + 7 is the sum of three 
rational squares, that is, suppose 



where you may as well assume that each fraction is reduced. Then, let m 
be the least common denominator of these three fractions. 

1.17 What forms can a cube $n^3$ take? See Problem 1.14 for an example of 
what we mean here. 

1.18 In each of the two smallest “distinct” cubic quadruples all four 
numbers are less than 10. 

1.19 This is the original form of the problem from the eleventh-century 
Byzantine manuscript, and we discussed this problem in the text in the 
section on arithmetic progressions. 

1.21 What common difference d does s = 16, t = 9 give? Once you find 
three integer squares in arithmetic progression whose common 
difference is d, then consider $d = 7m^2$ to find three rational squares in 
arithmetic progression whose common difference is 7. 


Chapter 2 

2.3 When doing the algebra on this problem, resist any tendency you may 
have to expand terms immediately. It is much easier to factor out any 
common terms first. 

2.4 Do this for $t_3$ first; that is, begin with a 7 × 7 square array, and 
subdivide it into eight triangular arrays with 6 stones each and have 
one stone left over. Then, for your proof do the same thing with a 
sufficiently large example that makes it clear that your proof works for 
any triangular number $t_n$. 

2.6 As you do the key inductive step in this proof, you will want to use the 
fact that $t_{n+1} = \frac{(n+1)(n+2)}{2}$. 

2.7 You know a formula for the sum of the odd numbers starting with 1, 
and also another formula for how many numbers are in a triangle. The 
first thing to do is to find the sum of all the numbers in the triangle 
through the nth row. 

2.8 Find a formula for the sum of the numbers in the nth row. Then, in 
order to use this formula in the inductive step, express the odd row 
numbered k+ 1 as some nth row. For example, the third odd row is 
really the fifth row. 

2.9 (a) Use the quadratic formula. 

2.13 Consider two cases: (i) n is odd; and (ii) n is even but not a power of 2. 

2.17 If $\sqrt{2}$ is rational, then you can write $\sqrt{2} = \frac{a}{b}$ where the integers a and b 
have no common factor. The contradiction you are looking for is that 
both a and b are even. 

2.19 This is a straightforward matter of carrying out the algebra needed to 
evaluate both $N(\alpha\beta)$ and $N(\alpha)N(\beta)$, using the definition of the norm 
function: $N(a + b\sqrt{-6}) = a^2 + 6b^2$. 

2.22 (c) Use the fact that of the three squares in the 120× box in the bottom two 
rows it is impossible for a 4 to be in two of these squares to determine 
the value of d. Then complete the bottom two rows. 


Chapter 3 

3.6 Note that $63 = 2^0 \cdot 3^2 \cdot 5^0 \cdot 7^1$, $70 = 2^1 \cdot 3^0 \cdot 5^1 \cdot 7^1$, and 
$90 = 2^1 \cdot 3^2 \cdot 5^1 \cdot 7^0$. 

3.7 First, show that d is a common divisor of a and b. To do this, use the 
division algorithm with a and d; express the remainder r as a linear 
combination of a and b, and conclude that r must be 0; then, do the 
same thing for b. Finally, show that if a number c is also a common 
divisor of a and b, then c ≤ d. 

3.8 Note that $\sqrt{360} < 19$. 

3.13 Factor the polynomial $n^5 - n$ into a product of linear and quadratic 
polynomials. There is an easy way to do this. 

3.14 It is natural to begin here by observing that any integer is congruent to 
0, 1, 2, 3, or 4 modulo 5. In this problem, however, since squares are 
involved, it is slightly easier to begin by saying that any integer n is 
congruent to 0, ±1, or ±2 modulo 5. Then, what are squares congruent 
to? 

3.15 In order to use the induction hypothesis, write one 7 as 6 + 1 . 

3.16 A trick similar to the one used in Problem 3.15 will work here, writing 
one 11 as 7 + 4. 

3.19 You will need to use Euclid’s lemma, Theorem 3.3, in the induction 
step. 

3.20 Use the canonical form of the prime decomposition of a number n. 

3.24 Since this number system has so few numbers in it, you can show that a 
number satisfies the property from Theorem 3.3 simply by trying all 
possibilities— there are only six really. This is a boring, but effective, 
technique. 

3.27 Ask yourself whether this number is divisible by 91. 



3.29 Let n be the number you are looking for. If n has a single prime in its 
prime decomposition, then $n = 2^a$, where (a + 1) - 2 = 26, and n is a 
huge number. 

If n has two primes in its prime decomposition, then $n = 2^a 3^b$, where 
(a + 1)(b + 1) - 3 = 26, but 29 is prime, so this case is impossible. 
Continue in this fashion and you will find what you are looking for. 

3.30 Use what you discovered about τ(n) in Problem 3.28. 

3.31 The number 6 is “completely divisible” since it is divisible by every 
positive integer less than $\sqrt{6}$; that is, 1 | 6 and 2 | 6. 

3.33 To explain why Zu’s approximation is so accurate look at the fifth 
partial quotient $q_5$. 

3.35 You should get a number that is slightly larger than shouldn't you? 


Chapter 4 

4.2 Using the notion of prime decomposition, we know that his age must 
be divisible by some number (find this number). There is then really 
only one multiple of this number which will make sense as an answer 
and it is also easy to check that this multiple fits the data provided. This 
problem can also be solved as a straightforward algebra problem using 
his age as the unknown. However, with either approach you do need to 
be careful to interpret the phrase about his son “attaining the measure 
of half his father’s life” as it was intended, which is that the son only 
lived half as long as his father lived; in other words, if Diophantus was 
n years old when he died, then his son died at the age of n/2. 

4.4 First, try solving the problem— as did Diophantus— for two 
“convenient” given numbers, say 20 and 250. 

4.5 Let the second square be $(2x - 3)^2$. 

4.9 Rewrite the equation by expanding both sides and putting it in the 
form f(x) = 0 where f(x) is a cubic polynomial. Since x = 3 is a 
solution to the original equation, we know that f(3) = 0; therefore, 
x - 3 will be a factor of the polynomial f(x). 

4.10 Begin, as did Diophantus, by taking 2x and $x^2 - 1$ as the legs of the 
triangle, and so the hypotenuse is $x^2 + 1$. Then set the perimeter to be a 
square of the form $m^2x^2$, and use the condition that the sum of the 
perimeter and the area is a cube to pick a suitable value for m. We are 
looking for rational numbers here. 

4.16 (b) This equation appears in Problem 4.15, but it is also quite easy to 

find the first solution by inspection. Then you should use (x + y4l) 2 
and (x + ysfl) 2 to find the next two solutions. 

4.19 There are two values of t that yield positive integer solutions for this 
equation. 

4.20 First, solve the equation 21x + 31 y = 1 by inspection. 

4.21 The key here is to find a way to use inequalities to reduce this to a finite 
number of cases to be considered. (This same strategy worked in 
Problems 4.19 and 4.20.) The way to do this is to let x, y, z be a solution 
and, without loss of generality, assume that x ≤ y ≤ z. Next, conclude 
that x > 1 and that $\left(1 + \frac{1}{x}\right)^3 \ge 2$, and then discover that there are only 
two possible values for x. Consider each of these cases separately, and 
for each x do a similar analysis to get possible values for y. Then solve 
for z for each possible x, y pair to see if z is an integer. 

Chapter 5 

5.4 The numbers in question here are meant to be rational. You have seen 
this problem before in Chapter 1 in a different guise. 

5.5 Use Theorem 1.1, and also recall that Fermat called Theorem 5.1 his 
“theorem on right-angled triangles.” 

5.9 Your calculator may not be able to help with a number as large as 
$2^{341} - 2$. In any event, use the idea we saw in the proof of Theorem 5.2 
that if d | n, then we can write $2^n - 1 = (2^d - 1)(2^{n-d} + 2^{n-2d} + \cdots + 1)$; so, 
since 341 = 11 · 31, choose an appropriate value of d for 11 and for 31. 
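
Where a computer is at hand, the factorization used in this hint can be checked on a small case, and the large power can be reduced directly. The sketch below does both. The helper name `geometric_cofactor` and the sample values n = 12, d = 3 are assumptions made for this illustration; they are not the values the hint asks you to choose.

```python
def geometric_cofactor(n, d):
    """Return 2^(n-d) + 2^(n-2d) + ... + 2^d + 1, assuming d divides n."""
    if n % d != 0:
        raise ValueError("d must divide n")
    return sum(2 ** (n - j * d) for j in range(1, n // d + 1))

# Check the identity 2^n - 1 = (2^d - 1)(2^(n-d) + ... + 1) on a small sample case.
n, d = 12, 3
identity_holds = (2 ** d - 1) * geometric_cofactor(n, d) == 2 ** n - 1
print(identity_holds)

# Python's three-argument pow performs modular exponentiation without ever
# forming the full 103-digit number 2^341.
residue = pow(2, 341, 341)
print(residue)
```

Built-in modular exponentiation is used because it keeps every intermediate value below the modulus squared.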

5.13 Assume $89 = a^2 + b^2$ and recall that $13 \cdot 89 = 34^2 + 1$. Use this to show 
that 89 | (34b - a). Then use the fundamental identity of Fibonacci and 
the fact that $89 = a^2 + b^2 = 5^2 + 8^2$. 

5.14 If a > 2, show that a - 1 is a factor of $a^n - 1$. Then, if d | n, show that 
$2^d - 1$ is a factor of $2^n - 1$. 

5.15 (a) In binary $2^{12} - 1$ = 111 111 111 111. 

5.17 Show two things: (i) all prime divisors of N are of the form 4n + 1; and 
(ii) all prime divisors of N are greater than $p_k$. One of these follows 
directly from a theorem. 

5.18 There are two equally good approaches here: induction or straight 
algebra. 

5.20 For each prime p the sum of the divisors of p is just 1 + p. Note that for 
the first three largest primes, 1 + p factors easily into a product of 
several small primes and one of the large prime factors of the number 
itself (for example, $1 + 616\,318\,177 = 2 \cdot 7^3 \cdot 898\,423$). The sum of the 
divisors of a prime term such as $5^5$ is 

$$1 + 5 + 5^2 + \cdots + 5^5 = \frac{5^6 - 1}{5 - 1} = 3906,$$

which can be factored similarly. 

Compute this sum for each prime and multiply the results to get the 
sum of all divisors. 

5.23 Assume k has an odd divisor (> 1) and then use this odd divisor to 
write down a factorization for $a^k + 1$. 

5.25 Begin by writing $2^{64} - 1 = (2^{32} + 1)(2^{32} - 1)$. 

5.29 (c) For m < n, consider the formula in a slightly different form: 

$$F_n = F_0 F_1 F_2 \cdots F_m \cdots F_{n-1} + 2.$$

5.30 (b) Replace n by n - 40 in the formula $n^2 - n + 41$. 

5.31 If the integer $N = \left(\frac{a}{c}\right)^2 + \left(\frac{b}{d}\right)^2$ is a sum of two rational squares, then 
consider the integer $c^2 d^2 N$. 

5.32 Recall that squares have the form 4n or 4n + 1. Also, take a look back at 
Problem 1.10. 

5.34 There is one number between 12 and 22, one between 22 and 35, 
another between 35 and 51, and another between 51 and 70. 

5.37 The key in each case is to find appropriate values for a and b. 

5.39 In each formula the “next” value of n is n + 2. 

5.40 (a) First, use the formula $\binom{p}{r} = \frac{p!}{r!\,(p - r)!}$ from Theorem 5.7 to show 
that $p \mid r!\binom{p}{r}$. 

(b) There is an easy way to do this. 

5.43 To prove the “Star of David” property, replace 35 by $\binom{n}{r}$ and label the six 
vertices of the star accordingly. 


Chapter 6 


6.1 Rather than doing all of the multiplications first, and then reducing 
each number modulo 17, it is much quicker to just keep adding 5 
modulo 17. In other words, modulo 17, you get: 


5, 5 + 5 = 10, 10 + 5 = 15, 15 + 5 = 3, 3 + 5 = 8 


6.3 (b) First show that ax ≡ b (mod m) has a solution if and only if d | b. 
Next, you can show that if x is a solution for the congruence, then 

$$x + \frac{m}{d}, \quad x + \frac{2m}{d}, \quad x + \frac{3m}{d}, \quad \ldots, \quad x + \frac{(d-1)m}{d}$$

are also solutions modulo m, and they are all distinct, including x. 
Finally, you need to show that the x-value of any of the infinitely many 
solutions provided by Theorem 4.1 for the equation is congruent to 
one of these d solutions; that is, these d solutions are the only 
solutions for the congruence. 


6.4 You can save yourself considerable effort by noticing that once you 
have found one easy-to-spot inverse pair such as 2 and 10, this 
immediately gives rise to another inverse pair -2 and -10, that is, the 
much-harder-to-spot inverse pair 17 and 9. 


6.5 (c) For each pair, look at the difference a, - cq . 

6.6 In each version of the story there is some redundancy in the 
congruences, so these systems can be simplified considerably. 


6.7 Prove the contrapositive: 

If n is composite, then $(n - 1)! \not\equiv -1 \pmod{n}$. 

6.8 If the integer 18! is too big for your calculator, then think of writing 18! 
in its prime decomposition. The idea is to factor as many 10s out of it as 
possible and write $18! = 10^a \cdot 2^b \cdot (3^c \cdot 5^d \cdot 7^e \cdot 11^f \cdot 13^g \cdot 17^h)$. Now you 
can use your calculator to compute the term in the parentheses and 
also to double it as many times as you can until some point where you 
have to take over the doubling by hand and also add on several zeros at 
the end corresponding to the 10s that were factored out. 

6.9 Use Wilson’s theorem, and note that each even number is congruent, 
modulo p, to a negative odd number: 2 ≡ -(p - 2) (mod p), 4 ≡ -(p - 4) 
(mod p), and so on. 

6.11 The key is that if a is a primitive root, then you know that aV ^ 1 
(mod p). 

6.12 For method (3), if your calculator does not carry enough digits to 
express 14! as an integer, you can leave out the numbers 3, 5, 6, and 10 
in the calculation of 14! since 3 · 10 = 5 · 6 = 30 ≡ 1 (mod 29). 

6.13 Since the solution x = 44! has fifty-five digits you may well need some 
kind of strategy to compute this number modulo 89. Here are two 
methods; you might want to try it both ways. 

One way is to proceed recursively through the computation of 
44! one number at a time always bringing each new product back 
within the range 1 to 88 (or -44 to 44); you can even skip some numbers 
along the way since there are some obvious inverse pairs whose 
products are 90: 3 · 30, 5 · 18, 6 · 15, 9 · 10 (and hence, congruent to 1 
modulo 89). 

Another way to do this is to use a “divide and conquer” technique, that 
is, split the forty-four numbers into manageable groups so that 
multiplying each group of numbers is something your calculator can 
handle. Then reduce these individual products modulo 89 and 
multiply these remainders together. 
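
The first method in this hint translates directly into a loop. The sketch below reduces modulo 89 after every multiplication, so no intermediate value exceeds 88 · 44. The function name `factorial_mod` is an assumption made for this illustration, and the cross-check against the full factorial is there only because Python integers have unlimited size.

```python
import math

def factorial_mod(k, m):
    """Compute k! modulo m, reducing after every multiplication."""
    result = 1
    for i in range(2, k + 1):
        result = (result * i) % m
    return result

x = factorial_mod(44, 89)
# Cross-check against the unreduced 55-digit factorial.
matches_direct = x == math.factorial(44) % 89
print(x, matches_direct)
```

The divide-and-conquer method from the hint would give the same residue, since products may be reduced modulo 89 at any stage.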

6.14 Assume p does not divide a, argue that a therefore has an inverse, and 
then find a contradiction using Theorem 6.4. 

6.16 (d) Look carefully at the coefficient of x on both sides of the expression 

in part (c). 


Chapter 7 

7.6 (a) Martin Gardner warned his readers: “The task is middling hard with 
four 4s, ridiculously easy with three, extremely difficult with two.” 

(b) The key is that $\phi(2^a) = 2^{a-1}$. 

7.12 Because of Corollary 1 of Fermat’s little theorem, we know that if d is 
the order of an integer r modulo 41, then d | 40. 

7.13 Suppose that, for some t where t | rs, $(bc)^t \equiv 1 \pmod{p}$. Then we can 
write t = uv where u | r and v | s (why?). Show that r | su, and use this to 
conclude that u = r. (And, similarly, v = s.) 

7.14 You can model your proof on the example given in the proof of 
Theorem 7.2 on page 198 for the congruence $x^3 \equiv 1 \pmod{13}$. 

7.17 Use induction to show that $a^{2^{k-2}} \equiv 1 \pmod{2^k}$ for all k ≥ 3, for each odd 
integer a. 

7.18 (b) In part (a) the reason we can show that $a^4 \equiv 1 \pmod{15}$ for 
each a is because lcm(φ(3), φ(5)) = lcm(2, 4) = 4; so, use the same 
idea in this general case using lcm(φ(m), φ(n)); also, Problem 3.5 
contains a useful formula, since φ(m) and φ(n) are both even (see 
Problem 7.7). 

7.19 Prove that $p^\alpha$ has a primitive root by using induction to show that if a is 
a primitive root of p, then a is also a primitive root of $p^\alpha$. 

7.25 One way to solve this congruence is to use the list of quadratic residues 
from part (a) of Problem 7.24. 

7.26 Don’t be surprised if you end up with a multiple other than 5 19. Also, 
be aware that 0 2 counts as a square. 

7.28 Show that each term is congruent to 0 modulo k. 

7.31 Use the formula proved in Problem 5.18. 

7.36 There are two good approaches to take. One approach is to use the 
solution to Problem 5.32, and show that numbers of the form 
$4n + 2$ (that is, numbers that cannot be written as a difference of two 
squares) can be written in terms of three squares. 

Another approach is to start from scratch, and show that odd 
numbers can be written in terms of two squares, and even numbers can 
be written in terms of three squares. 

7.37 Use the identity $6k = (k + 1)^3 + (k - 1)^3 - 2k^3$. 

7.38 Even if your calculator will not compute $422\,481^4$ as an integer, you 
can still use it to help with this calculation by writing 422 481 as 
$422 \cdot 1000 + 481$, and using the binomial theorem to expand $(a + b)^4$. 
You can then use your calculator for most of the arithmetic involving 
422 and 481. Or you can use Sage. 
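The splitting suggested in the hint for Problem 7.38 can be sketched in plain Python (used here in place of Sage; the variable names are our own). The five terms of the binomial expansion of $(a + b)^4$ with $a = 422 \cdot 1000$ and $b = 481$ are computed separately, as you would on a calculator, and then compared with the direct computation.

```python
from math import comb

# Split 422 481 as a + b, exactly as the hint suggests.
a, b = 422 * 1000, 481

# Binomial theorem: (a + b)^4 = sum over j of C(4, j) * a^(4 - j) * b^j.
terms = [comb(4, j) * a ** (4 - j) * b ** j for j in range(5)]
total = sum(terms)

for j, term in enumerate(terms):
    print(f"j = {j}: {term}")
print("sum of the five terms:", total)
print("direct computation:   ", 422481 ** 4)
```

Each term involves only small powers of 422 and 481 followed by a block of zeros, which is what makes the hand calculation feasible.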

7.40 For the case n = 4, think simple. 

7.42 You proved an important formula in Chapter 5 that is quite useful here 
in explaining why this sum is 0. 

7.43 It is the nature of a Diophantine equation to restrict the numbers we 
accept as solutions. So you first need to decide what integer values of d, 
s, and m even make sense in this case. Does d = 2 make sense? What 
about s = 1 or m = 0? 

The five solutions arise because $\frac{1}{d} + \frac{1}{s} > \frac{1}{2}$. 

Chapter 8 

8.6 Gauss used l’Hôpital’s rule to prove this. You will also need to use the 
fundamental theorem of calculus. By the way, l’Hôpital’s rule was not 
discovered by l’Hôpital, who included it in a book he published in 
1696. Two years earlier, he had paid its true discoverer, Johann 
Bernoulli, a retainer for the rights to Bernoulli’s discoveries, including 
this one. 

8.8 Euler’s conjecture involves the value of such primes modulo 20. 

8.9 Show, and then use, the fact that v has an inverse modulo p. 

8.11 See the restatement of Theorem 6.4 following the proof of Gauss’s 
lemma. 

8.12 Use the multiplicative property of the Legendre symbol to express (— ) 
in terms of (|). 

8.15 For each of the four cases (that is, where p has the form 12k + 1, 
12k + 5, 12k + 7, or 12k + 11) determine the pattern for how many of 
the numbers from the list $3, 6, 9, \ldots, 3 \cdot \frac{p-1}{2}$ fall in each of the three 
intervals: $[0, \frac{p}{2}]$, $[\frac{p}{2}, p]$, $[p, \frac{3p}{2}]$. 

8.17 Consider the intervals [-13, -™] and [-26, -^]. 

8.18 For each of these three primes p, (~^) = (y )( ~) = (— ), since 64 = 8 2 
is a quadratic residue. 
9.1 In this case, ignore the sieve part of the algorithm; that is, do not 
establish a mesh size ahead of time. The main thing to practice in this 
problem is looking for a combination of numbers in the sequence 
$43^2 - 1817$, $44^2 - 1817$, . . . , that collectively forms a square. 

9.5 Suppose n has a divisor d such that 1 < d < n. Let $d_1, d_2, \ldots, d_{\tau(n)}$ be 
the divisors of n (1, d, and n are included in this list). Then conclude 
that $n \cdot \tau(n) - \sigma(n) > \phi(n)$. 

9.6 (b) Don’t forget to show that 2" - 1 is composite. 

(c) Don't forget to show that is an integer. 

9.7 First, use the fact that q = 2p - 1 to show that p - 1 divides n - 1; 
conclude that $2^{n-1} \equiv 1 \pmod{p}$. Next, show that $q \equiv 1 \pmod{8}$, which 
means that $\left(\frac{2}{q}\right) = 1$, and use Euler’s criterion to show that $2^{(q-1)/2} \equiv 1 \pmod{q}$; 
conclude that $2^{n-1} \equiv 1 \pmod{q}$. 

9.8 Use the fact that $2^{n+1} \mid 2^{2^n}$ to show that $2^{2^{n+1}} - 1$ divides $2^{2^{2^n}} - 1$ (see 
Problem 9.6). Then note that $2^{2^{n+1}} - 1 = (2^{2^n} + 1)(2^{2^n} - 1)$. 

9.9 Note that $2^9 - 1 = 7 \cdot 73$ and $2^{29} - 1 = 1103 \cdot 486\,737$. 

9.10 (c) All primes in the prime decomposition of this number are less than 
20 except for the largest prime factor, 41. 

(d) All primes in the prime decomposition of this number are less than 
20 except for the largest prime factor, 73. 

9.11 (a) If you can do modular arithmetic on your calculator (or using Sage), 
start with $2^{280}$ modulo 561 and then work down from there. 

Otherwise, use fast exponentiation starting with 2, working your way 
up to do $2^{35}$ first. 

(b) In drawing your conclusion here, pretend that a = 50 was a 
randomly chosen number and also that it is the only number you have 
tested. 
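For part (a) of Problem 9.11, the "work down" strategy can be sketched in plain Python, where the built-in `pow(base, exponent, modulus)` does the fast modular exponentiation. The loop below is only an illustration under our own naming: it starts at exponent 280 and keeps halving until the exponent is odd.

```python
n = 561
exponent = 280
chain = []  # pairs (exponent, 2^exponent mod 561), from 280 downward

while True:
    chain.append((exponent, pow(2, exponent, n)))
    if exponent % 2 == 1:  # stop once the exponent is odd (here, 35)
        break
    exponent //= 2

for e, residue in chain:
    print(f"2^{e} mod {n} = {residue}")
```

Reading the printed residues from top to bottom is exactly the comparison the problem asks you to interpret.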

9.12 We need to prove that $3^k \mid 2^{3^k} + 1$ for all $k \geq 1$. You can use the induction 
hypothesis to argue that $2^{3^k} \equiv -1 \pmod{3}$ and then use the expansion 
$a^3 + 1 = (a + 1)(a^2 - a + 1)$ with $a = 2^{3^k}$. 

9.14 Look for a pattern in the binary representations of 6, 28, and 496. 
9.15 The first conjecture is true; the second is false. For the first conjecture, 
consider three cases: p = 2, and for odd primes whether p is congruent 
to 1 or 3 modulo 4. Since $2^4 = 16$, what can you conclude about $2^{4k+1}$? 
What can you conclude about $2^{4k+3}$? 

9.17 Remember that for a perfect number, $2n = \sigma(n) = \sum_{d \mid n} d$, and that as d 
ranges over the divisors of n, so does n/d. Moreover, n remains constant. 

9.18 (a) $\sigma(2^k) = 1 + 2 + 2^2 + \cdots + 2^k = 2^{k+1} - 1$. 

(b) $\sigma(p^k) = 1 + p + p^2 + \cdots + p^k = \frac{p^{k+1} - 1}{p - 1}$. 

(d) Prove that the sequence 

\[ \frac{\sigma(1)}{1}, \ \frac{\sigma(2)}{2}, \ \frac{\sigma(3)}{3}, \ \ldots \]

is unbounded; that is, prove that there is no largest value of $\frac{\sigma(n)}{n}$. 

9.19 Consider two cases: odd and even perfect numbers (this is necessary 
even though it is extremely unlikely that any odd perfect numbers 
exist). Then, in each case, work to find a pattern, or set of rules, that 
must apply to any such number. 

9.21 Use the formula for Fermat numbers given in Problem 5.29. 

9.22 Prove that $d = \gcd(2^m - 1, 2^n + 1) = 1$ by letting $2^m - 1 = sd$, 
$2^n + 1 = td$, and looking at $2^{nm}$ modulo d. Show that $d \mid 2$, and therefore, 
d = 1. 

9.27 (b) Since $p \mid 2^{2^n} + 1$, we have that $2^{2^n} \equiv -1 \pmod{p}$. Square both sides. 

(c) If r is the order of 2 modulo p, then use part (b) to conclude that 
$r = 2^s$ for some $s \leq n + 1$. Then show that s = n + 1 by finding a 
contradiction if s < n + 1. 

(d) First, use Corollary 1 of Fermat's little theorem to show that 
$2^{n+1} \mid p - 1$. This allows you to conclude that $8 \mid p - 1$ and use 
Theorem 8.3. 


(e) Your goal here should be to show that 2" +1 1 


*4 
10.6 Use Problem 10.2, and for each pair of twin primes p and p + 2 
determine whether p - 4 and p + 6 are prime. 

10.8 (a) Think about writing a given partial fraction, such as \ + | ^ , with 

a common denominator. You can then use prime factorization to show 
why the denominator is not a divisor of the numerator. 

(b) Think about taking a given partial fraction, such as 
$x = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{19} + \frac{1}{20}$. We can construct an integer N to 
multiply by x so that, term by term, the product Nx turns into an 
integer, except for a single term. Hence x itself could not be an integer. 
We put all the odd numbers from 1 through 19 into N to take care of 
the fractions with odd denominators. We also want N to contain at 
least some powers of 2 to take care of the fractions with even 
denominators. But if we put in too many 2s (for example, four in this 
case) then we would have overdone it, since all fractions would then 
have been taken care of. The key is to put in just the right number of 
2s. So, letting $N = 2^3 \cdot (1 \cdot 3 \cdot 5 \cdots 19)$, when we compute Nx we see that 
since N is divisible by all of the integers from 1 to 20 with a single 
exception, this product will not be an integer. This idea works in general. 
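The construction in part (b) can be checked with exact rational arithmetic. The sketch below assumes that the partial sum in question is $x = 1 + \frac12 + \cdots + \frac1{20}$ and that $N = 2^3 \cdot (1 \cdot 3 \cdot 5 \cdots 19)$; the variable names are ours.

```python
from fractions import Fraction
from math import prod

# Assumed partial sum: x = 1 + 1/2 + 1/3 + ... + 1/20 (exact arithmetic).
x = sum(Fraction(1, k) for k in range(1, 21))

# N = 2^3 times the product of the odd numbers 1, 3, 5, ..., 19.
N = 2 ** 3 * prod(range(1, 20, 2))

# Denominators k for which N/k is NOT an integer.
exceptional = [k for k in range(1, 21) if N % k != 0]

Nx = N * x
print("denominators not cleared by N:", exceptional)
print("N * x =", Nx, "(an integer only if the denominator is 1)")
```

Since exactly one term survives as a fraction, Nx cannot be an integer, and so neither can x.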

10.11 (a) Each prime in this sequence is the largest prime less than twice the 
previous prime. 

10.13 For n > 3, treat the even and odd cases for n separately, writing n = 2k 
and n = 2k + 1, and invoking Bertrand’s postulate for the integer k. 

10.14 (b) Use induction and the result from part (a). 

10.15 Look at $10! = 3\,628\,800 = 2^8 \cdot 3^4 \cdot 5^2 \cdot 7$, for example, and think about 
why this fails to be a square. 

10.17 To prove that two statements are equivalent, you first assume that one 
statement is true and use that information to prove the other 
statement; then you reverse the process by assuming the second 
statement is true and use that information to prove the first statement. 

10.20 Note that 7 and 91 are not relatively prime. 

10.22 If p = a + nd is prime in an arithmetic progression, it appears as the nth 
term. So the term that appears p terms later in the progression will be 
a + (n + p)d. 

10.24 (d) Test the numbers 509 - 2, 509 - 4, 509 - 8, . . . , 509 - 256 to see if 
any are prime, using the tables of primes at the back of the book if 
necessary. 


Chapter 11 

11.2 (b) Your search should try each of y = 0, 1, 2, 3, . . . , 887 199. Here is 
a program using Sage to do this. First we define a function “square(n)” 
that produces a list of all y values up to n such that $15^{11} + 8^{11} - 11y^2$ 
is a square. (Be very careful that all indentations line up exactly as 
shown.) 

    def square(n): 
        y = []                       # the y values found so far 
        for i in range(0, n): 
            # In Sage, ^ denotes exponentiation. 
            if is_square(15^11 + 8^11 - 11*(i^2)): 
                y.append(i) 
        return y 

Then all you do is ask Sage to evaluate square(887200). 
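If Sage is not at hand, the same search can be sketched in plain Python. This is an assumption-laden equivalent rather than the book's program: the helper `is_perfect_square` is our own stand-in for Sage's `is_square`, and the expression tested is taken to be $15^{11} + 8^{11} - 11y^2$.

```python
from math import isqrt

def is_perfect_square(m):
    """Return True when m is the square of a nonnegative integer."""
    if m < 0:
        return False
    r = isqrt(m)
    return r * r == m

def square(n):
    """List every y in 0, 1, ..., n - 1 for which 15^11 + 8^11 - 11*y^2 is a square."""
    target = 15 ** 11 + 8 ** 11
    return [y for y in range(n) if is_perfect_square(target - 11 * y * y)]

# The search bound: the largest y with 11*y^2 <= 15^11 + 8^11.
print(isqrt((15 ** 11 + 8 ** 11) // 11))
```

Calling `square(887200)` then runs the full search; in Python the exponent operator is `**`, not `^`.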

11.4 (c) Use the formula twice. 

11.9 First use Fermat’s little theorem to show that $a \equiv b \pmod{p}$, and so you 
can write a - b = pk for some k. Then use the binomial theorem to 
expand $a^p - b^p = (b + pk)^p - b^p$. 

11.10 For p > 2 see the proof of Theorem 11.2. For p = 2 think about 
Pythagorean triangles. 


Chapter 12 

12.4 (c) Write the fraction ^ as the sum of two unit fractions. 

12.8 Think of the golden ratio as | 

12.9 Draw two diagonals EB and AC and let F be their point of intersection. 
Then show that triangles ABE and FAB are similar and use that fact to 
derive an equation that will eventually yield the familiar equation 
$x^2 - x - 1 = 0$ where x is the length of EB. 

12.15 First, verify that the formula holds for n = 0 and n = 1. Then, assume 
that the formula holds for all integers up to and including some integer 
$n \geq 1$, and use the identities $\alpha^2 = \alpha + 1$ and $\beta^2 = \beta + 1$ to prove that the 
formula holds for the integer n + 1. 

12.19 (b) Use the basic formula for geometric series in Problem 5.18. 

12.21 (c) You can count tilings on the left side of this identity by grouping 

the tilings according to which cell i is covered by the first square — that 
is, all cells to the left of cell i are covered by dominoes. 

12.27 Make sure to write out enough terms to get to where the sequence 
starts over with 1 , 1 , 2 ,.... 

12.28 As we did in Chapter 10, use factorials to find n consecutive Fibonacci 
numbers that are all composite. 

12.30 Let $s_n$ be the number of subsets of the set {1, 2, 3, . . . , n} that contain 
no pair of consecutive numbers. Consider $s_1$ and $s_2$, and then prove 
that the sequence $s_1, s_2, s_3, \ldots$ follows the same recurrence relation as 
the Fibonacci sequence. 
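Before proving the recurrence in Problem 12.30, it can help to see the numbers. The brute-force sketch below counts the subsets directly; it assumes that the empty set counts as a subset with no pair of consecutive numbers, and the function name is ours.

```python
from itertools import combinations

def count_sparse_subsets(n):
    """Count subsets of {1, ..., n} containing no two consecutive numbers."""
    count = 0
    for size in range(n + 1):
        for subset in combinations(range(1, n + 1), size):
            # combinations() yields increasing tuples, so it is enough
            # to compare neighbouring entries.
            if all(b - a > 1 for a, b in zip(subset, subset[1:])):
                count += 1
    return count

counts = [count_sparse_subsets(n) for n in range(1, 11)]
print(counts)
```

Comparing each entry with the sum of the two before it is a quick sanity check on the recurrence you are asked to prove.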

12.31 To find the number of possible arrangements with n chairs, first 
consider the number of arrangements with a man in the first chair, and 
then consider the number of arrangements with a woman in the first 
chair. 

12.32 Find a first-order recurrence relation for the sequence $c_1, c_2, c_3, \ldots$. 
Then use induction. 

12.36 (b) It will be useful to first verify that $1 + \alpha^2 = \alpha\sqrt{5}$ and $1 + \beta^2 = -\beta\sqrt{5}$. 

12.37 Prove a recurrence relation for these determinants. To do so, first 
expand an n x n determinant about the nth row. 

12.38 Consider the Fibonacci number Fg. 

12.40 Try using the generating function f{x) for the Fibonacci sequence as 
an actual function. That is, find an appropriate value of x to use. And 
don’t forget to make sure that this infinite series converges for this 
value of x. 
Chapter 13 

13.2 Think about what you would do with an encrypted message if you did 
not know the key word, but you did know the length of the key word. 

13.9 If n and M are not relatively prime, one of p or q divides M (but not 
both, since M < n). So assume that $p \mid M$, and that q and M are relatively 
prime. Then prove that $C^b \equiv M \pmod{p}$ and $C^b \equiv M \pmod{q}$. 


Chapter 14 

14.1 Start at the bottom of the continued fraction and first write $4 + \frac{1}{5}$ as $\frac{21}{5}$; 
then continue upward writing 

\[ 3 + \cfrac{1}{4 + \cfrac{1}{5}} = 3 + \cfrac{1}{\frac{21}{5}} = 3 + \frac{5}{21}, \]

and so on. 
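The bottom-up procedure in the hint for Problem 14.1 is mechanical enough to sketch in a few lines of Python. The function name `evaluate` and the list representation of the partial quotients are our own choices for illustration.

```python
from fractions import Fraction

def evaluate(quotients):
    """Evaluate a finite continued fraction, starting at the bottom."""
    value = Fraction(quotients[-1])
    for q in reversed(quotients[:-1]):
        # One step upward: q + 1/value, e.g. 4 + 1/5 = 21/5.
        value = q + 1 / value
    return value

print(evaluate([4, 5]))     # the bottom step: 4 + 1/5
print(evaluate([3, 4, 5]))  # one level up: 3 + 1/(4 + 1/5)
```

Extending the list on the left with further partial quotients continues the "and so on" of the hint.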


n.u — — 7 t . 

x [q \,qz qA 

14.7 By definition, C\ = = { . We compute c 2 = 1 + \ = § = g . Now use 

Theorem 14.2— and a table— to find c 3 , c 4 , and c 5 . 

14.11 First find the gcd of 900 and 628, thus reducing the equation to the 
form ax + by = c where gcd(a, b) = 1. Next use continued fractions to 
solve the equation ax + by = 1 (see Problems 14.1 and 14.7), and then 
scale your answer up by a factor of c. 
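The plan in the hint for Problem 14.11 can be sketched as follows. This is one possible arrangement, not necessarily the one intended in the text: after dividing out gcd(900, 628), the convergents of the continued fraction for a/b are computed, and the next-to-last convergent supplies a solution of ax + by = 1. The function names are hypothetical.

```python
from math import gcd

def cf_quotients(a, b):
    """Partial quotients of the continued fraction for a/b."""
    quotients = []
    while b:
        quotients.append(a // b)
        a, b = b, a % b
    return quotients

def solve_unit(a, b):
    """Return (x, y) with a*x + b*y == 1, assuming gcd(a, b) == 1 and a, b > 0."""
    h_prev, h = 0, 1  # numerators of successive convergents
    k_prev, k = 1, 0  # denominators of successive convergents
    for q in cf_quotients(a, b):
        h_prev, h = h, q * h + h_prev
        k_prev, k = k, q * k + k_prev
    # Now h/k equals a/b, and h*k_prev - h_prev*k is +1 or -1.
    sign = h * k_prev - h_prev * k
    return sign * k_prev, -sign * h_prev

g = gcd(900, 628)
a, b = 900 // g, 628 // g
x, y = solve_unit(a, b)
print(g, a, b, cf_quotients(a, b), (x, y))
```

Multiplying x and y by c then gives a solution of the reduced equation ax + by = c.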


14.12 See Problem 14.7. 

14.14 1 day = 24 hours = 60 x 24 minutes = 60 x 60 x 24 seconds. 

14.15 First, write x - 1 = p=-|. 

14.16 The number x is the positive solution to the equation x = 

14.17 Use the quadratic formula in reverse to find appropriate values for a, b, 
and c. 
14.19 (a) You can do this either as we did in the text to show that $[2, \overline{4}] = \sqrt{5}$, 
or as we did to find the irrational number which equals [1, 2, 3, 4, 5 ]. 

14.24 Use Theorem 14.10. 

14.28 The only part that is tricky is verifying (*)' = y. Express ~ in the form 
u + Vsfn by first writing y = y ^= , and then rationalizing the 
denominator by multiplying by yyQ = . Do the same thing for (^)', but 

here you first rationalize the denominator for * = and then take 

the conjugate. 


Chapter 15 

15.4 You should quickly notice a simple pattern in the row corresponding to 
m = 2. Also, near the end, if you like, you should be able to skip some of 
the calculations since the only goal is to find P(7, 50). 

15.5 A good way to prove this is to represent partitions in a visual way. For 
example, for n = 5 the three diagrams 

1 1 1 1 1 1 1 1|1 1 1 1 1 1 1 1 1 1 
represent, respectively, the three partitions 4 + 1, 2 + 3, and 1 + 2 + 2. 

15.13 A good approach is to break this into cases depending on the size of the 
square in the upper left corner of the Ferrers graph. For example, if the 
square has size 3x3 that leaves eight dots to place symmetrically on 
each side. You also have to argue that the partitions you get this way 
are the only ones where all parts are odd and distinct. 

15.16 There are fourteen partitions of each type. 

15.18 Hope for the best, and assume that x and y are both squares. 


Solutions to Selected Problems 


Chapter 1 

1.2 Since squares have remainder 0 or 1 when divided by 4, it is impossible 
for the sum of two squares to have a remainder of 3 when divided by 4; 
hence this equation has no solution. 

1.3 Since the area $A = \frac{1}{2}xy = st(s^2 - t^2) = st(s + t)(s - t)$, the idea is to 
choose two sets of values for s and t so that the four terms s, t, s + t, and 
s - t yield the same product. Thus for s = 5 and t = 2 the four terms are 
5, 2, 7, and 3; whereas for s = 6 and t = 1 the four terms are 6, 1, 7, 
and 5. Obviously, these two products are equal; therefore, {20, 21, 29} 
and {12, 35, 37} each has area 210. 

Similarly, for s = 15 and t = 6 the four terms are 15, 6, 21, and 9; and 
for s = 18 and t = 3 the four terms are 18, 3, 21, and 15. Again, it is 
obvious that these two products are equal. Thus {180, 189, 261} and 
{108, 315, 333} each has area 17 010. 
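The two pairs of triangles in the solution to Problem 1.3 are easy to confirm numerically. The sketch below assumes the usual parametrization x = 2st, y = s² - t², z = s² + t² (consistent with the area formula above); the function names are ours.

```python
def triple(s, t):
    """Pythagorean triple generated by s > t > 0."""
    return (2 * s * t, s * s - t * t, s * s + t * t)

def area(s, t):
    """Area of the right triangle with legs 2st and s^2 - t^2."""
    x, y, _ = triple(s, t)
    return x * y // 2

for s, t in [(5, 2), (6, 1), (15, 6), (18, 3)]:
    print((s, t), sorted(triple(s, t)), area(s, t))
```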

1.5 We may as well assume that the triangle is a primitive Pythagorean 
triangle, since if it is not, then it is similar to a smaller triangle that is a 
primitive Pythagorean triangle, and if that smaller triangle has an 
inscribed circle whose radius is an integer, then so will the original 
larger triangle. This allows us to use Theorem 1.1. 

Subdivide the triangle into three smaller triangles by connecting the 
center of the inscribed circle to the three vertices of the triangle. Then, 
since the area of the triangle equals the sum of the areas of these three 
triangles, and letting r be the radius of the circle, we get 

\[ \frac{1}{2}xy = \frac{1}{2}rx + \frac{1}{2}ry + \frac{1}{2}rz = \frac{1}{2}r(x + y + z), \]

and canceling the $\frac{1}{2}$, this can be rewritten as 

\[ xy = r(x + y + z), \]

that is, as 

\[ 2st(s^2 - t^2) = r\left(2st + (s^2 - t^2) + (s^2 + t^2)\right) = 2rs(t + s). \]

Solving for r, we get r = t(s - t). Therefore, r is an integer. 
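As a sanity check on the formula r = t(s - t), the inradius can also be computed directly as area divided by half the perimeter, which is the relation xy = r(x + y + z) used above. The sketch assumes the parametrization x = 2st, y = s² - t², z = s² + t²; the function name is ours.

```python
from fractions import Fraction

def inradius(s, t):
    """Inradius from xy = r(x + y + z), for the triple generated by s > t > 0."""
    x, y, z = 2 * s * t, s * s - t * t, s * s + t * t
    return Fraction(x * y, x + y + z)

# Compare with the closed form t(s - t) for a few pairs.
for s, t in [(2, 1), (3, 2), (4, 1), (13, 12)]:
    print((s, t), inradius(s, t), t * (s - t))
```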




1.6 For the moment, we will assume that the solution occurs when the 
right triangle is a primitive Pythagorean triangle. By the solution to 
Problem 1.5, then, we know that 12 = t(s - t). Solving for s, we get 
$s = t + \frac{12}{t}$. Since s must be an integer, the possible values for t are 
1, 2, 3, 4, 6, 12. The hypotenuse is given by $z = (t + \frac{12}{t})^2 + t^2$, and the 
largest value for the hypotenuse is achieved when t = 12. The largest 
hypotenuse, therefore, is $13^2 + 12^2 = 313$. 

Any triangle that is not a primitive Pythagorean triangle will 
necessarily have a smaller hypotenuse. For example, suppose such a 
triangle is similar to a primitive Pythagorean triangle where the radius 
of the inscribed circle is 6. Then, repeating the argument above, we get 
a hypotenuse for this triangle given by $z = (t + \frac{6}{t})^2 + t^2$, and the largest 
value for this hypotenuse is achieved when t = 6, and so this 
hypotenuse would be $7^2 + 6^2 = 85$. Thus, in this case, the hypotenuse 
of the original Pythagorean triangle would only be 170. 

1.7 The set {6, 10, 15} is such a set. These numbers are not pairwise 
relatively prime (that is, no pair of these numbers is relatively prime), 
and yet, the only common factor of all three numbers in this set is 1. 

This can’t happen with Pythagorean triples. If {x, y, z} is a primitive 
Pythagorean triple, then x, y, and z must be pairwise relatively prime. 
Suppose that p is a prime that divides both x and y; then p divides 
$x^2 + y^2$, which means that p divides $z^2$. Therefore p must divide z. 
Hence {x, y, z} cannot be a primitive Pythagorean triple. The argument 
is identical for the other two cases where there is a prime that divides 
both x and z, and where there is a prime that divides both y and z. 

1.8 As a decimal, $\frac{1}{75} = 0.013\,333\,333\ldots$; since $\frac{1}{75} = \frac{48}{60^2}$, as a sexagesimal 
$\frac{1}{75} = 0.0, 48, 0, 0, 0, 0, 0, \ldots$. As a decimal we see that $\frac{1}{7}$ repeats with a 
period of 6: $\frac{1}{7} = 0.142\,857\,142\,857\ldots$; but, by long division, 

\[ \frac{1}{7} = \frac{8}{60} + \frac{34}{60^2} + \frac{17}{60^3} + \frac{8}{60^4} + \frac{34}{60^5} + \frac{17}{60^6} + \cdots, \]

and so, as a sexagesimal, the fraction 1/7 repeats with a period of 3: 
$\frac{1}{7} = 0.8, 34, 17, 8, 34, 17, \ldots$. 

That the reciprocal of the prime number 7 has period 6 in the 
decimal system is no accident. The great early nineteenth-century 
mathematician Gauss (who began thinking about this when he was 
eleven) proved that if 10 is a primitive root of a prime p, then 1/p has 
period p - 1. (We say that 10 is a primitive root of 7 because the six 
numbers $10, 10^2, 10^3, 10^4, 10^5, 10^6$ have all six possible nonzero 
remainders when divided by 7, namely, 3, 2, 6, 4, 5, 1.) 
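The long division in the solution to Problem 1.8 works the same way in any base, so it can be sketched as one small routine. The function name `expand` and its argument order are our own; it returns the first few digits after the point of numerator/denominator in the given base.

```python
def expand(numerator, denominator, base, digits):
    """First `digits` places of numerator/denominator after the point, in `base`."""
    result = []
    remainder = numerator % denominator
    for _ in range(digits):
        remainder *= base                        # bring down the next place
        result.append(remainder // denominator)  # next digit
        remainder %= denominator
    return result

print(expand(1, 7, 10, 12))   # decimal digits of 1/7
print(expand(1, 7, 60, 6))    # sexagesimal places of 1/7
print(expand(1, 75, 60, 4))   # sexagesimal places of 1/75
```

The period of the expansion is simply the point at which the remainder first returns to its starting value.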
1.9 The triangles range from 45.21 degrees at the top of the tablet to 58.11 
degrees at the bottom with spacing that seems fairly even, as small as 
about a half a degree in several places, but only one rather noticeable 
jump where the gap is almost two degrees between rows 11 and 12. 
What is especially intriguing is that by adding a sixteenth row 
corresponding to s = 125 and t = 64, this gap is filled in almost 
perfectly. There is no other valid combination of s and t under 125 
missing. Was this simply another error? Was $64 = 2^6$ considered too 
large because of the exponent? It also seems strange not to have taken 
another step closer to 60 degrees. 

1.10 First, for n = 1, the formula becomes $1 = 1^2$, which is obviously true. 
Next, we assume that the formula is true for a value $n \geq 1$; that is, we 
assume that 

\[ 1 + 3 + 5 + \cdots + (2n - 1) = n^2. \]

Then we consider the formula for the value n + 1 and use the above 
assumption to transform the left side of this formula into the desired 
right side for the formula: 

\[ 1 + 3 + 5 + \cdots + (2n + 1) = \left(1 + 3 + 5 + \cdots + (2n - 1)\right) + (2n + 1) = n^2 + (2n + 1) = (n + 1)^2. \]

Therefore, the formula is true for all $n \geq 1$. 

1.11 Using the assumption, we can write 

\[ 1 + 3 + 5 + \cdots + 95 + 97 + 99 = 49^2 + 99 = 2401 + 99 = 2500 = 50^2, \]

as desired. 

1.12 (a) First, for n = 1, the formula becomes $1^2 = \frac{1(1+1)(2 \cdot 1+1)}{6}$, which is 
obviously true. Next we assume that the formula is true for a value 
$n \geq 1$; that is, we assume that 

\[ 1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}. \]

Then we consider the formula for the value n + 1 and use the above 
assumption to transform the left side of this formula into the desired 
right side for the formula: 

\[ 1^2 + 2^2 + 3^2 + \cdots + n^2 + (n+1)^2 = \frac{n(n+1)(2n+1)}{6} + (n+1)^2 \]
\[ = (n+1)\left(\frac{n(2n+1)}{6} + (n+1)\right) = \frac{(n+1)(2n^2 + 7n + 6)}{6} = \frac{(n+1)(n+2)(2n+3)}{6}, \]

exactly as desired, since $\frac{(n+1)(n+2)(2n+3)}{6} = \frac{(n+1)((n+1)+1)(2(n+1)+1)}{6}$. 

Therefore, the formula is true for all $n \geq 1$. 

(b) In order to find the integer n with the property that the sum of 
the first n squares is a square, the idea is to look for an integer n so that 
each of the three terms $\frac{n}{6}$, n + 1, and 2n + 1 are squares. Trying 
multiples of 6, you quickly come to n = 24, which works because 
$\frac{24}{6} = 4$ is a square, and 24 + 1 = 25 and $2 \cdot 24 + 1 = 49$ are also squares. 
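Part (b) can also be confirmed by a direct search. The sketch below uses the closed formula n(n + 1)(2n + 1)/6 from part (a) and tests each value for squareness; the search limit of 1000 and the function name are arbitrary choices of ours.

```python
from math import isqrt

def sum_of_squares(n):
    """1^2 + 2^2 + ... + n^2 by the formula from part (a)."""
    return n * (n + 1) * (2 * n + 1) // 6

# Values of n below 1000 for which the sum of the first n squares is a square.
hits = [n for n in range(1, 1000)
        if isqrt(sum_of_squares(n)) ** 2 == sum_of_squares(n)]

print(hits)
print(sum_of_squares(24), isqrt(sum_of_squares(24)))
```

Apart from the trivial case n = 1, the search turns up only n = 24, where the sum is $4900 = 70^2$.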

1.13 This is almost self-evident. If the number n is prime, then it is the 
product of one prime. If n is composite, then n = rs where both r and s 
are smaller than n. This argument can now be repeated on the two 
factors r and s and, in turn, repeated on any subsequent factors that 
arise. Since at each stage the factors get smaller, this process eventually 
has to stop. This is an infinite descent argument. 

This argument can be made more formal as follows. Suppose there is 
a composite number that cannot be written as a product of primes. 
Then there is a smallest such composite number, call it n. But n is 
composite, so n = rs where both r and s are smaller than n. Therefore, 
by the choice of n, both r and s can be written as a product of primes, 
hence so can n, which is a contradiction. 

This can also be proved using a form of induction called strong 
induction. The variation from ordinary induction is a minor one: you 
prove a statement is true for a small value such as n = 1, and then you 
prove that whenever the statement is true for all values up to and 
including n, it is also true for the next value n + 1. In this case, we first 
observe that the statement is true for the smallest value, namely, n = 2. 
Now for a number $n \geq 2$ assume that the statement is true for all values 
$\leq n$. If n + 1 is prime, we are done. If, on the other hand, n + 1 is 
composite, then we can write n + 1 = rs where r, s < n + 1. But then 
$r, s \leq n$, and so, by our assumption, r and s can each be written as a 
product of primes, hence so can n + 1. 

1.15 By considering the three cases x = 3n, x = 3n + 1, and x = 3n + 2 it is 
easy to see that $x^2$ must have the form 3n or 3n + 1. Therefore $x^2 + 3y^2$ 
must also have one of these two forms (because 3 obviously divides 
$3y^2$), and hence cannot have the form 3n - 1. 

1.16 There are four cases to consider: n = 4k, n = 4k + 1, n = 4k + 2, and 
n = 4k + 3. Squaring 4k yields a number of the form 8k; squaring 4k + 1 
or 4k + 3 yields a number of the form 8k + 1; and squaring 4k + 2 yields 
a number of the form 8k + 4. Thus any square must have one of these 
three forms. It is therefore impossible to add three or fewer squares to 
achieve a sum of the form 8k + 7. 

Now suppose that an integer N of the form 8k + 7 is the sum of three 
rational squares, that is, suppose 

\[ N = \left(\frac{a}{d}\right)^2 + \left(\frac{b}{e}\right)^2 + \left(\frac{c}{f}\right)^2, \]

where we may as well assume that each fraction is reduced. Then let m 
be the least common denominator of these three fractions. Thus 

\[ m^2 N = \left(\frac{ma}{d}\right)^2 + \left(\frac{mb}{e}\right)^2 + \left(\frac{mc}{f}\right)^2, \]

where the right-hand side is now a sum of square integers (since m is a 
common denominator). Therefore, $m^2$ cannot be of the form 8k + 1 
because then $m^2 N$ would be of the form 8k + 7, which we know is 
impossible since $m^2 N$ is the sum of three squares. Hence $m^2$ must be of 
the form 8k or 8k + 4. This means that 4 divides $m^2$. So the left-hand 
side $m^2 N$ is of the form 4k, and each square on the right-hand side is 
either of the form 4k or the form 4k + 1, but the only way this can 
happen is if all three squares on the right-hand side have the form 4k. 
Since m is even and is the least common denominator, one of the 
denominators (without any loss of generality we can assume this 
denominator is d) is not only even but also is such that d has the same 
number of 2s in its prime decomposition that m does. Hence $\frac{m}{d}$ is odd. 
Furthermore, since d is even, a must be odd (we assumed each fraction 
is reduced), so since $\frac{m}{d}$ is also odd, we can conclude that $\left(\frac{ma}{d}\right)^2$ is odd, 
but this is impossible because we previously concluded that this square 
has the form 4k. Therefore, no integer of the form 8k + 7 is the sum of 
three rational squares. 
1.17 A cube $n^3$ must have the form 7k, 7k + 1, or 7k + 6. This can be checked 
by cubing each of 7k, 7k + 1, 7k + 2, 7k + 3, 7k + 4, 7k + 5, 7k + 6. If 
none of $a^3$, $b^3$, or $c^3$ has the form 7k, it is impossible that $a^3 + b^3 = c^3$. 
(For example, if $a^3 = 7k + 1$ and $b^3 = 7j + 1$, then $a^3 + b^3 = 
7k + 1 + 7j + 1 = 7(k + j) + 2$, which can’t be a cube.) Hence one of 
the three numbers is divisible by 7. 

1.18 The smallest solution is the one that is the most surprising: {3, 4, 5, 6}. 
Another solution, the third smallest, is {7, 14, 17, 20}. We leave the 
solution that is the second smallest for you to find on your own. (Note 
that we do not consider the solution {6, 8, 10, 12} to be distinct from 
{3, 4, 5, 6}.) 

1.19 This is the same as the problem discussed in the text from the 
eleventh-century Byzantine manuscript, since solving in the rationals 
is equivalent to solving $b^2 - a^2 = c^2 - b^2 = 5m^2$ in the integers. In this 
case, as we saw, $5m^2 = 720$ works, so m = 12 and we get 
$\left(\frac{31}{12}\right)^2, \left(\frac{41}{12}\right)^2, \left(\frac{49}{12}\right)^2$ as the three squares. 

1.20 For s = 2 and t = 1, you get $s^2 - t^2 = 3$ and st = 2 so d = 24 and the 
three squares are $1^2, 5^2, 7^2$. (We used $s^2 + t^2 = 5$ to find the middle 
square and then used the common difference d = 24 to find the other 
two squares, 25 - 24 = 1 and 25 + 24 = 49.) 

For s = 3 and t = 2, the three squares are $7^2, 13^2, 17^2$; for s = 4 and 
t = 1, the three squares are $7^2, 17^2, 23^2$; and for s = 5 and t = 2, the 
three squares are $1^2, 29^2, 41^2$. 

Note: for values < 7, only s = 5 and t = 4 yields a common 
difference of the form $5m^2$. 
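The computations in Problem 1.20 follow one recipe, which the sketch below packages as a function. It assumes, consistently with the values d = 24 above and d = 100 800 in Problem 1.21, that the common difference is $d = 4st(s^2 - t^2)$ and that the middle square is $(s^2 + t^2)^2$; the function name is ours.

```python
from math import isqrt

def three_squares(s, t):
    """Return (a, b, c, d) where a^2, b^2, c^2 are in arithmetic progression
    with common difference d, built from s > t > 0."""
    middle = s * s + t * t
    d = 4 * s * t * (s * s - t * t)
    low = isqrt(middle ** 2 - d)
    high = isqrt(middle ** 2 + d)
    return low, middle, high, d

for s, t in [(2, 1), (3, 2), (4, 1), (5, 2), (5, 4)]:
    print((s, t), three_squares(s, t))
```

For s = 5 and t = 4 the common difference is $720 = 5 \cdot 12^2$, which is the case used in Problem 1.19.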

1.21 Here $s^2 - t^2 = 175$ and st = 144, so d = 100 800. We can factor d as 

\[ d = 100\,800 = 224 \cdot 450 = 126 \cdot 800, \]

where 224 + 126 = 800 - 450, so Fibonacci’s method will work. 

Thus 

\[ 100\,800 = 126 \cdot 800 = 675 + 677 + \cdots + 923 + 925 = 463^2 - 337^2 \]

and 

\[ 100\,800 = 224 \cdot 450 = 227 + 229 + \cdots + 671 + 673 = 337^2 - 113^2. \]

So $113^2, 337^2, 463^2$ forms an arithmetic progression with common 
difference 100 800. Hence 

\[ (113/120)^2, \ (337/120)^2, \ (463/120)^2 \]

forms an arithmetic progression with common difference 7. (Note that 
in using this method, we use the two factorizations of 100 800 and the 
sums of consecutive odd integers to find the three squares rather than 
finding 337 by computing $s^2 + t^2 = 16^2 + 9^2 = 256 + 81 = 337$ and 
then adding and subtracting 100 800 to $337^2$ to find 463 and 113.) 


Chapter 2 

2.1 For n = 1, $t_1 = 1$ and $\frac{1(1+1)}{2} = 1$, so the formula holds true. Now, assume 
the formula is true for the nth triangular number, where $n \geq 1$, and 
consider the next triangular number $t_{n+1}$. Thus we get 

\[ t_{n+1} = t_n + (n + 1) = \frac{n(n+1)}{2} + (n + 1) = (n + 1)\left(\frac{n}{2} + 1\right) = \frac{(n+1)(n+2)}{2}, \]

and the formula holds for $t_{n+1}$ since $\frac{(n+1)(n+2)}{2} = \frac{(n+1)((n+1)+1)}{2}$. 

2.7 Since the numbers are in a triangle, the last number in the nth row is 
just the $t_n$th odd number (for example, 29 is the last number in the fifth 
row and 29 is the fifteenth odd number, where $15 = t_5$). So the sum of 
all the numbers in the triangle through the nth row is 

\[ 1 + 3 + 5 + \cdots + (2t_n - 1) = t_n^2. \]

Thus the sum of the numbers in the nth row is $t_n^2 - t_{n-1}^2$, and with a 
little algebra this can be shown to be $n^3$ (this is Problem 2.3). 

This can also be proved using a visual geometric argument. The nth 
row whose sum is $n^3$ starts with the number n(n - 1) + 1. Think of an 
n x n x n cube constructed from 1 x 1 x 1 cubes. What do the n 
numbers n(n - 1) + 1, n(n - 1) + 3, n(n - 1) + 5, . . . , n(n - 1) + 2n - 1 
represent? Well, the number n(n - 1) + 1 represents an n x (n - 1) stack 
of 1 x 1 x 1 cubes with 1 cube left over; the number n(n - 1) + 3 
represents an n x (n - 1) stack of 1 x 1 x 1 cubes with 3 cubes left over, 
and so on. So the sum of the n numbers in the nth row represents an 
n x n x (n - 1) block with $1 + 3 + 5 + \cdots + (2n - 1) = n^2$ cubes left over, 
just enough for an n x n layer of cubes to complete the n x n x n cube. 

2.8 The base case, k = 1, is clear since $1 = 1^4$. Assume the statement is true 
for a positive integer k. 

First, we find a formula for the sum of the numbers in the nth row. 
Since the array is triangular, the first number in the nth row is 
$\frac{(n-1)n}{2} + 1$, and the last number in the nth row is $\frac{n(n+1)}{2}$. Therefore, by 
symmetry, the average of the numbers in the nth row is 

\[ \frac{1}{2}\left(\frac{(n-1)n}{2} + 1 + \frac{n(n+1)}{2}\right) = \frac{1}{2}\left(\frac{n^2 - n + 2 + n^2 + n}{2}\right) = \frac{n^2 + 1}{2}. \]

So the sum of the numbers in the nth row is $\frac{n(n^2+1)}{2}$. 

Alternatively, we could also find this formula by arguing that since 
the last number in the nth row is $t_n$ and the last number in the previous 
row is $t_{n-1}$, the sum of the numbers in the nth row is just 

\[ (1 + 2 + 3 + \cdots + t_n) - (1 + 2 + 3 + \cdots + t_{n-1}) = \frac{t_n(t_n + 1)}{2} - \frac{t_{n-1}(t_{n-1} + 1)}{2}, \]

and, with a little algebra, this reduces to $\frac{n(n^2+1)}{2}$. 

Now we are ready for the inductive step. Since the odd row we 
number as k + 1 corresponds to the nth row where n = 2k + 1, the sum 
of the numbers in the first k + 1 odd rows equals the sum of the 
numbers in the first k odd rows plus the sum in row n = 2k + 1, that is, 

\[ k^4 + \frac{(2k+1)\left((2k+1)^2 + 1\right)}{2} = k^4 + (2k+1)(2k^2 + 2k + 1) = k^4 + 4k^3 + 6k^2 + 4k + 1 = (k+1)^4, \]

as desired. 
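Both the row-sum formula and the final identity in Problem 2.8 can be spot-checked numerically. The sketch below builds the rows of the triangular array 1; 2, 3; 4, 5, 6; ... directly, with function names of our own choosing.

```python
def row(n):
    """The nth row of the triangular array 1; 2 3; 4 5 6; ..."""
    first = (n - 1) * n // 2 + 1
    return list(range(first, first + n))

def sum_first_odd_rows(k):
    """Sum of all numbers in rows 1, 3, 5, ..., 2k - 1."""
    return sum(sum(row(2 * j - 1)) for j in range(1, k + 1))

for k in range(1, 6):
    print(k, sum_first_odd_rows(k), k ** 4)
```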

2.9 (b) In general terms, the test is to solve the equation $N = \frac{n(n+1)}{2}$ for n; if 
you get an integer, then N is a triangular number. However, it is more 
useful to solve for n in general, and then say instead that the test is that 
$\frac{-1 + \sqrt{1 + 8N}}{2}$ must be an integer, that is, $\sqrt{1 + 8N}$ is an odd integer. 
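The test for triangular numbers in Problem 2.9 translates directly into code. In the sketch below (the function name is ours), `isqrt` keeps the arithmetic exact; note that 1 + 8N is always odd, so whenever it is a perfect square its square root is automatically an odd integer.

```python
from math import isqrt

def triangular_index(N):
    """Return n with n(n + 1)/2 == N, or None if N is not triangular."""
    root = isqrt(1 + 8 * N)
    if root * root != 1 + 8 * N:
        return None          # 1 + 8N is not a perfect square
    return (root - 1) // 2   # this is (-1 + sqrt(1 + 8N)) / 2

print(triangular_index(15))  # 15 = 1 + 2 + 3 + 4 + 5
print(triangular_index(16))  # not triangular
```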

2.10 First, write $\frac{n(n+1)(n+2)(n+3)}{8}$ as $\frac{1}{2} \cdot \frac{n(n+3)}{2} \cdot \frac{(n+1)(n+2)}{2}$. Then, let $m = \frac{n(n+3)}{2}$, and 
we get 

\[ \frac{n(n+1)(n+2)(n+3)}{8} = \frac{1}{2} \cdot \frac{n(n+3)}{2} \cdot \frac{(n+1)(n+2)}{2} = \frac{m(m+1)}{2}, \]

since n(n + 3) + 2 = (n + 1)(n + 2). 
2.11 Assume $t_n$ is a square, that is 

\[ \frac{n(n+1)}{2} = m^2. \]

Then 

\[ t_{4n(n+1)} = t_{8m^2} = \frac{8m^2(8m^2 + 1)}{2} = (2m)^2\left(4n(n+1) + 1\right) = (2m)^2(2n+1)^2 \]

is a square. 


2.12 (b) $p_n = 1 + 4 + 7 + \cdots + (3n - 2)$. 

(c) $h_n = 1 + 5 + 9 + \cdots + (4n - 3)$. 

(d) p„ = ^11. 

2.13 Obviously odd numbers are trapezoidal since 2k + 1 = k + (k + 1). If n is 
an even positive number, but not a power of 2, then we can write 
$n = 2^m(2k + 1)$, where k > 0, and then we can express n as a sum of 
2k + 1 consecutive integers with $2^m$ in the middle. 

For example, for n = 112, we write $112 = 16 \cdot 7$, and then we can 
express 112 as a sum of seven consecutive integers with 16 in the 
middle: 

\[ 112 = 13 + 14 + 15 + 16 + 17 + 18 + 19, \]

and so 112 is trapezoidal. 

As another example, for n = 18, we write $18 = 2 \cdot 9$, and the sum, 
with 2 in the middle, is 

\[ 18 = (-2) + (-1) + 0 + 1 + 2 + 3 + 4 + 5 + 6, \]

but then in this case we simply drop (-2) + (-1) + 0 + 1 + 2 to get 
18 = 3 + 4 + 5 + 6, and so 18 is trapezoidal. 

Conversely, assume that n is a trapezoidal number. We can write n as 
a sum of s > 1 consecutive numbers beginning with an integer k + 1, so 

\[ n = (k + 1) + (k + 2) + \cdots + (k + s) = t_{k+s} - t_k = \frac{(k+s)(k+s+1)}{2} - \frac{k(k+1)}{2} = \frac{s(2k + s + 1)}{2}. \]

Now, one of s or 2k + s + 1 must be odd (and greater than 1), and the 
other even, so n cannot be a power of 2. 
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For a visual geometric proof of part of this result, see Mathematics 
Magazine 70 (October 1997), 294. 
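
The formula $n = \frac{s(2k+s+1)}{2}$ from the converse can be turned into a small search. The following Python sketch is only an illustration; the function name `trapezoidal_run` is hypothetical, and it assumes that "trapezoidal" means a sum of at least two consecutive positive integers.

```python
def trapezoidal_run(n):
    """Return a list of s >= 2 consecutive positive integers summing to n,
    or None if no such run exists."""
    s = 2
    # n = (k+1) + ... + (k+s) = s*k + s(s+1)/2 for some integer k >= 0
    while s * (s + 1) // 2 <= n:
        rest = n - s * (s + 1) // 2   # this must equal s*k
        if rest % s == 0:
            k = rest // s
            return list(range(k + 1, k + s + 1))
        s += 1
    return None

print(trapezoidal_run(112))
print(trapezoidal_run(18))
print(trapezoidal_run(16))
```

The search returns the shortest run it finds, so for 18 it may report a different run from the one in the text; any run with the right sum shows that the number is trapezoidal.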

2.14 (b) There are two good options here. You can draw a diagram of stones 

showing how two odd numbers can be joined to achieve an even 
number; or, alternatively, you can say, or show, how each odd number 
has one extra stone, so if these are taken away, the remainders will be 
even and can be lined up to produce an even number, and the two 
extra stones can be placed at the end, resulting in an even 
number. 

Euclid’s proof of Proposition 22 did not use stones — instead he used 
line segments which were joined to represent addition, but the idea is 
the same and went like this. Since each of the numbers is odd, a stone 
can be subtracted from each and the remainders will be even; so the 
sum of these remainders will be even. But the multitude of the extra 
stones is also even. Therefore the whole is even. 

2.17 Assume that $\sqrt{2} = \frac{a}{b}$ where the fraction $\frac{a}{b}$ is reduced. Then $2 = \frac{a^2}{b^2}$, or in
other words, $a^2 = 2b^2$. So $a^2$ is even, but then $a$ must also be even.
Therefore, we can write $a = 2c$ and rewrite the previous equation as
$2b^2 = a^2 = (2c)^2 = 4c^2$. Thus $b^2 = 2c^2$, so $b$ is also even. This contradicts
our assumption that $a$ and $b$ are relatively prime and completes the
proof.

Note that this is at heart an infinite descent argument, because $b$ can
be written as $2d$ and so $\sqrt{2} = \frac{a}{b} = \frac{c}{d}$. Thus, if $\sqrt{2}$ can be written as a fraction $\frac{a}{b}$,
then it can also be written as a fraction with smaller numbers $\frac{c}{d}$, but
then it could be written as a fraction $\frac{e}{f}$ with still smaller numbers,
and so on. Such an infinite descent yields a contradiction without our
needing to assume that $\frac{a}{b}$ is reduced at the beginning.

2.18 Assume, by way of contradiction, that $\log 2 = \frac{a}{b}$ for two positive
integers $a$ and $b$. Thus $10^{a/b} = 2$, and $10^a = 2^b$; that is, $2^a \cdot 5^a = 2^b$. But
this is impossible since 5 is not a factor of $2^b$. Therefore, $\log 2$ is
irrational.

2.21 For the twelve notes of the Pythagorean scale we get 
For the tuning with A = 440 Hz we get

    C        C♯       D        E♭       E        F
    260.74   278.44   293.33   309.03   330.00   347.65

    F♯       G        G♯       A        B♭       B        C
    371.25   391.11   417.66   440      463.54   495.00   521.48

Chapter 3 

3.7 Since $d \in S$, we can write $d = xa + yb$, for some integers $x$ and $y$. First,
we show that $d$ divides $a$. By the division algorithm we can write

\[ a = qd + r, \quad \text{where } 0 \le r < d. \]

Solving for $r$, and substituting for $d$, we get

\[ r = a - qd = a - q(xa + yb) = (1 - qx)a + (-qy)b. \]

At this point it looks as though $r \in S$, because we have expressed $r$ as a
linear combination of $a$ and $b$. However, we know that $r \notin S$ because
$r < d$, and $d$ is the least element in $S$. The only possibility, therefore, is
that $r = 0$. Hence $d$ does in fact divide $a$. Similarly, we can show that $d$
divides $b$; so $d$ is a common divisor of $a$ and $b$.

Now, suppose that $c$ is also a positive common divisor of $a$ and $b$.
Then $c$ divides $xa + yb = d$, that is, $c$ divides $d$, and since we know
$d \ne 0$, we conclude that $c \le d$. Therefore, $d$ is the greatest common
divisor of $a$ and $b$.

3.13 First, it is clear that 2 divides $n^5 - n$, since $n^5$ is odd or even when $n$ is
odd or even. Next, we write

\[ n^5 - n = n(n^4 - 1) = n(n^2 - 1)(n^2 + 1) = n(n-1)(n+1)(n^2+1). \]

We can easily see that 3 divides $n^5 - n$ since 3 must divide one of $n - 1$,
$n$, or $n + 1$.

Now we consider five cases:

\[ n \equiv 0, \quad n \equiv 1, \quad n \equiv 2, \quad n \equiv 3, \quad n \equiv 4 \pmod{5}. \]

For the first two of these cases, and the last, one of the three linear
terms in the expression $n(n-1)(n+1)(n^2+1)$ is divisible by 5, and for
the cases $n \equiv 2$ and $n \equiv 3$, the quadratic term in this expression is
divisible by 5. Hence 5 divides $n^5 - n$ for any $n$.

Therefore, since 2, 3, and 5 are pairwise relatively prime, 30 divides
$n^5 - n$ for any $n$.

3.19 The statement is clearly true for $n = 1$. Assume the statement is true for
an integer $n \ge 1$. If $p \mid a_1 a_2 \cdots a_{n+1}$, then $p \mid (a_1 a_2 \cdots a_n) \cdot a_{n+1}$, so, by
Theorem 3.3, Euclid’s lemma, $p \mid a_1 a_2 \cdots a_n$ or $p \mid a_{n+1}$. In the first case,
by the induction hypothesis, $p \mid a_i$ for some $1 \le i \le n$; otherwise, $p \mid a_{n+1}$.
Therefore, $p \mid a_i$ for some $1 \le i \le n+1$, and the statement is true for all $n$.

3.22 Consecutive integers are relatively prime, so if the product of two
consecutive integers $a$ and $b$ is a square, then $a$ and $b$ are each squares,
which is impossible since no two positive squares differ by 1.

Suppose $n$, $n + 1$, $n + 2$ are three consecutive positive numbers
whose product is a square. If $n + 1$ is even, then these three numbers are
pairwise relatively prime and, as above, each must be a square, which is
impossible. If $n + 1$ is odd, then $n$ and $n + 2$ are even, but no prime
other than 2 can divide more than one of these three numbers.
Therefore, $n + 1$ is a square, and also the product $n(n + 2)$ is a square.
Now, we write $n = 2k$, and we get $n(n + 2) = 2^2 k(k + 1)$. Therefore,
$k(k + 1)$ is a square, which, as we have just seen, is impossible. Hence
three consecutive positive numbers cannot have a square product.

3.23 We saw in Chapter 2 that the number $2 + \sqrt{-6}$ is irreducible; in fact we
called it “prime” there for that very reason. We also saw that
$(2 + \sqrt{-6}) \mid 2 \cdot 5 = 10$ since $10 = (2 + \sqrt{-6}) \cdot (2 - \sqrt{-6})$. Can
$(2 + \sqrt{-6}) \mid 2$? If it does, then we can write $2 = (2 + \sqrt{-6}) \cdot (a + b\sqrt{-6})$,
where $a$ and $b$ are integers. But, multiplying out this product, and
rewriting 2 as $2 + 0 \cdot \sqrt{-6}$, we get

\[ 2 + 0 \cdot \sqrt{-6} = (2a - 6b) + (a + 2b)\sqrt{-6}, \]

and so $2a - 6b = 2$ and $a + 2b = 0$. Solving for $a$ and $b$ we get $a = \frac{2}{5}$ and
$b = -\frac{1}{5}$, contradicting the fact that $a$ and $b$ are integers. Can
$(2 + \sqrt{-6}) \mid 5$? If it does, then we can write $5 = (2 + \sqrt{-6}) \cdot (a + b\sqrt{-6})$,
and, as above, $2a - 6b = 5$ and $a + 2b = 0$. Solving for $a$ and $b$ we get
$a = 1$ and $b = -\frac{1}{2}$, again a contradiction. Thus $2 + \sqrt{-6}$ does not satisfy
the property from Theorem 3.3.

3.24 The number 3 is not irreducible because $3 = 3 \cdot 3$. Note that, in fact,
you could keep “factoring” 3 forever in this way and never stop.

On the other hand, if $3 \mid ab$, then the product $ab$ is either 0 or 3, and
so $a$ and $b$ take on values from {0, 1, 3}, and it is clear that the property
from Theorem 3.3 holds.

3.28 $\tau(n)$ is odd if and only if $n$ is a square. The easy proof is to notice that
the divisors of an integer $n$ always come in pairs: $d$ and $\frac{n}{d}$. Hence the
only way that $\tau(n)$ can fail to be even is if for some divisor $d = \frac{n}{d}$, which
happens if and only if $d^2 = n$.

For the second proof, we use the canonical prime decomposition

\[ n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}, \]

and we can see that the number of divisors is given by

\[ \tau(n) = (e_1 + 1)(e_2 + 1) \cdots (e_k + 1), \]

and this product is odd if and only if each term is odd. But each term is
odd if and only if each of $e_1, e_2, \ldots, e_k$ is even, that is, if and only if $n$ is
a square.

3.29 Let $n$ be the number we are looking for. If $n$ has a single prime in its
prime decomposition, then $n = 2^a$, where $(a + 1) - 2 = 26$, and $n$ is a
huge number.

If $n$ has two primes in its prime decomposition, then $n = 2^a 3^b$, where
$(a + 1)(b + 1) - 3 = 26$, but 29 is prime, so this case is impossible.

If $n$ has three primes in its prime decomposition, then $n = 2^a 3^b 5^c$,
where $(a + 1)(b + 1)(c + 1) - 4 = 26$, so $(a + 1)(b + 1)(c + 1) = 30$. To
make $n$ as small as possible, we must choose $c = 1$. This means that
$(a + 1)(b + 1) = 15$, so again, to make $n$ as small as possible, we must
choose $a = 4$, $b = 2$. Thus $n = 2^4 3^2 5^1 = 720$. This is the smallest
possible $n$.

If $n$ has four primes in its prime decomposition, then $n = 2^a 3^b 5^c 7^d$,
where $(a + 1)(b + 1)(c + 1)(d + 1) - 5 = 26$, so since 31 is prime, this is
impossible.

If $n$ has five primes in its prime decomposition, then $n = 2^a 3^b 5^c 7^d 11^e$,
where $(a + 1)(b + 1)(c + 1)(d + 1)(e + 1) - 6 = 26$, and this has a unique
solution $a = b = c = d = e = 1$. Thus $n = 2^1 3^1 5^1 7^1 11^1 = 2310$
is also a positive integer with exactly twenty-six positive composite
divisors.

Since having more than five primes is not possible, we are now ready
to list the ten smallest, in order:

    2^4 · 3^2 · 5 = 720;    2^4 · 3^2 · 7 = 1008;   2^4 · 5^2 · 3 = 1200;   2^4 · 3^2 · 11 = 1584;

    3^4 · 2^2 · 5 = 1620;   2^4 · 3^2 · 13 = 1872;  3^4 · 2^2 · 7 = 2268;

    2 · 3 · 5 · 7 · 11 = 2310;   2^4 · 7^2 · 3 = 2352;   2^4 · 3^2 · 17 = 2448.
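
As a cross-check on this list, a brute-force count is easy to write. The sketch below is only an illustration (the helper name `composite_divisor_count` and the search limit 2500 are assumptions chosen here); it counts the divisors of $n$ that are greater than 1 and not prime.

```python
def composite_divisor_count(n):
    """Count the divisors d of n that are composite (d > 1 and not prime)."""
    def is_prime(d):
        if d < 2:
            return False
        i = 2
        while i * i <= d:
            if d % i == 0:
                return False
            i += 1
        return True

    return sum(1 for d in range(2, n + 1) if n % d == 0 and not is_prime(d))

# All n below 2500 with exactly twenty-six composite divisors.
smallest = [n for n in range(1, 2500) if composite_divisor_count(n) == 26]
print(smallest)
```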

3.30 The state of the $m$th door changes, whenever $k$ divides $m$, each time
the jailer begins at the $k$th cell, turning his key in every $k$th door.
Therefore, the state of the $m$th door will change exactly $\tau(m)$ times,
where $\tau(m)$ is the number of divisors of $m$. Since the $m$th door was
locked at the beginning, it will be unlocked at the end if and only if
$\tau(m)$ is odd, that is, if and only if $m$ is a square (see Problem 3.28).
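
The jailer's rounds can also be simulated directly. This is a minimal sketch in which the number of doors (100 here) and the function name `unlocked_doors` are assumptions made for illustration; the problem's own number of cells may differ.

```python
def unlocked_doors(num_doors):
    """Simulate the jailer: on round k he turns the key in every kth door."""
    locked = [True] * (num_doors + 1)      # index 0 is unused
    for k in range(1, num_doors + 1):
        for m in range(k, num_doors + 1, k):
            locked[m] = not locked[m]      # door m changes once per divisor k
    return [m for m in range(1, num_doors + 1) if not locked[m]]

print(unlocked_doors(100))
```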

3.31 The largest such number is 24, and it is divisible by 1, 2, 3, and 4. If $n$ is
such a number with $n \ge 25$, then 5 must also divide $n$, and we see that
$3 \cdot 4 \cdot 5 = 60 \mid n$. Thus $n \ge 60$ by property (8) of the basic divisibility
properties (see Problem 3.12), and we can conclude that $7 \mid n$. So,
$3 \cdot 4 \cdot 5 \cdot 7 = 420 \mid n$, and now we know that $16 \mid n$, $9 \mid n$, $5 \mid n$, $7 \mid n$, $11 \mid n$, $13 \mid n$,
$17 \mid n$, and $19 \mid n$; therefore, $n \ge 232\,792\,560$. We could continue, but by
now you get the idea why 24 is the largest such number.

3.32 Here are the first six partial quotients for $\pi$:

\[ \pi = 3 + \cfrac{1}{7 + \cfrac{1}{15 + \cfrac{1}{1 + \cfrac{1}{292 + \cfrac{1}{1 + \cdots}}}}} \]


3.33 Zu’s rational approximation is given by

\[ 3 + \cfrac{1}{7 + \cfrac{1}{15 + \cfrac{1}{1}}} = \frac{355}{113}. \]

The reason this approximation is so accurate is that the next partial
quotient is 292, which is a relatively large number. This means that
what we are ignoring, namely $\frac{1}{292 + \cdots}$ in the rest of the infinite
continued fraction is quite small.
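
The convergents can be computed mechanically from the partial quotients. The following Python sketch is only an illustration; the function name `convergents` is hypothetical, and it uses the standard recurrence for numerators and denominators with the quotients 3, 7, 15, 1, 292 of $\pi$.

```python
from fractions import Fraction

def convergents(quotients):
    """Return the convergents of the continued fraction [q0; q1, q2, ...]."""
    h_prev, h = 1, quotients[0]   # numerators
    k_prev, k = 0, 1              # denominators
    result = [Fraction(h, k)]
    for q in quotients[1:]:
        h_prev, h = h, q * h + h_prev
        k_prev, k = k, q * k + k_prev
        result.append(Fraction(h, k))
    return result

for c in convergents([3, 7, 15, 1, 292]):
    print(c, float(c))
```

The fourth convergent is Zu's 355/113; the jump in the size of the next denominator reflects the large partial quotient 292.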

3.35 The equation $\alpha = 2 + \frac{1}{\alpha}$ reduces to the quadratic equation
$\alpha^2 - 2\alpha - 1 = 0$, and the positive solution to this equation is $1 + \sqrt{2}$.

3.36 If we let $\alpha$ be the number represented by this continued fraction, then
$\alpha$ satisfies the equation $\alpha = 1 + \frac{1}{\alpha}$, which reduces to the quadratic
equation $\alpha^2 - \alpha - 1 = 0$, and the positive solution to this equation is
the golden ratio $\frac{1+\sqrt{5}}{2}$.

3.37 Letting $a = 2001$ and $b = 1984$, here are the steps. Since 2001 divided
by 1984 yields a quotient of 1, we add the first two equations. Since
1984 divided by 17 yields a quotient of 116, we multiply the third
equation by 116 and add the result to the second equation. There are
three more similar steps before we get to the final equation.

    a - 0 · b = 2001,
    0 · a - b = -1984,
    a - b = 17,
    116a - 117b = -12,
    117a - 118b = 5,
    350a - 353b = -2,
    817a - 824b = 1.
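
The same bookkeeping can be automated. The sketch below is a standard extended Euclidean algorithm, not necessarily the exact layout used in the text (it keeps all remainders positive rather than alternating signs); the function name `extended_gcd` is an assumption.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with x*a + y*b == g, where g = gcd(a, b)."""
    x0, y0, r0 = 1, 0, a      # invariant: x0*a + y0*b == r0
    x1, y1, r1 = 0, 1, b      # invariant: x1*a + y1*b == r1
    while r1 != 0:
        q = r0 // r1
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
        r0, r1 = r1, r0 - q * r1
    return r0, x0, y0

g, x, y = extended_gcd(2001, 1984)
print(g, x, y)
```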


3.38 (a)

    1 + i    1        2        1 - i
    2        1 - i    1 + i    1
    1 - i    1 + i    1        2
    1        2        1 - i    1 + i
(b)

    1 + i    2        1 - i    1
    1        1 + i    2        1 - i
    2        1 - i    1        1 + i
    1 - i    1        1 + i    2


(c)

    2         1 - 2i    1 + 2i    1         1 - i     1 + i
    1 + 2i    1 - i     1         1 + i     2         1 - 2i
    1 - 2i    1         1 + i     1 - i     1 + 2i    2
    1 - i     1 + 2i    1 - 2i    2         1 + i     1
    1         1 + i     2         1 + 2i    1 - 2i    1 - i
    1 + i     2         1 - i     1 - 2i    1         1 + 2i


Chapter 4 

4.1 Since the number of apples must be divisible by 3, 5, and 8, this
number must also be divisible by 120, by Theorem 3.2. But

\[ \tfrac{1}{3} \cdot 120 + \tfrac{1}{4} \cdot 120 + \tfrac{1}{5} \cdot 120 + \tfrac{1}{8} \cdot 120 = 40 + 30 + 24 + 15 = 109, \]

and $109 + 10 + 1 = 120$, so you started with 120 apples. (Note: this
clearly is the only answer since for a larger multiple of 120, the number
of apples given to the first four people would increase proportionally,
whereas the number given to the last two people remains constant.)

Using algebra we get the linear equation

\[ x = \tfrac{1}{3}x + \tfrac{1}{4}x + \tfrac{1}{5}x + \tfrac{1}{8}x + 10 + 1. \]

Thus

\[ x = \left(\tfrac{1}{3} + \tfrac{1}{4} + \tfrac{1}{5} + \tfrac{1}{8}\right)x + 11 = \frac{40 + 30 + 24 + 15}{120}\,x + 11 = \frac{109}{120}\,x + 11, \]

and $11x = 11 \cdot 120$, and $x = 120$.

Note how similar these two methods are in the end, since you wind
up computing $40 + 30 + 24 + 15$ either way.

4.2 Since 12 and 7 both divide his age, Diophantus must have died when 
he was eighty-four. 

4.4 We could, for example, take 20 and 250 as the given numbers. Then
$10 + x$ and $10 - x$ are the two unknown numbers, and
$(10 + x)^2 + (10 - x)^2 = 250$. This reduces to $2x^2 = 50$, and so $x = 5$, and
the two numbers are 15 and 5.

If, in general, we let $a$ be the given sum, and $b$ the sum of the
squares, then we seek to solve the simultaneous equations

\[ x + y = a \quad \text{and} \quad x^2 + y^2 = b. \]

By writing $y = a - x$, we get $x^2 + (a - x)^2 = b$. This yields the quadratic
equation $2x^2 - 2ax + (a^2 - b) = 0$, which has solution $x = \frac{a \pm \sqrt{2b - a^2}}{2}$.
Thus, if $x$ is to be an integer, it is necessary for $2b - a^2$ to be a square.
This condition is also sufficient, since if $2b - a^2$ is a square, it is odd or
even according to whether $a$ is odd or even, so $a \pm \sqrt{2b - a^2}$ is even.

4.5 Letting the two squares be $x^2$ and $9 - x^2$, we can choose $m = 2$, and set
the second square equal to $(2x - 3)^2$. So $4x^2 - 12x + 9 = 9 - x^2$. Then
$5x^2 = 12x$, and $x = \frac{12}{5}$. Thus one square is $\frac{144}{25}$ and the other is
$9 - \frac{144}{25} = \frac{81}{25}$, and $9 = \left(\frac{12}{5}\right)^2 + \left(\frac{9}{5}\right)^2$, a sum of two squares.

4.6 Simply multiply his solution through by 25 to get

\[ 400 = 25 \cdot 16 = 25 \cdot \left(\left(\tfrac{16}{5}\right)^2 + \left(\tfrac{12}{5}\right)^2\right) = 16^2 + 12^2. \]


4.10 Using the hint, the perimeter is $2x^2 + 2x$, and the sum of the perimeter
and area is $x^3 + 2x^2 + x$, so we set $2x^2 + 2x = m^2x^2$; dividing through by
$x$, and then solving for $x$ yields $x = \frac{2}{m^2 - 2}$.

Note that when the time comes we will need to choose $m > \sqrt{2}$ to
ensure that $m^2 - 2$ will be positive, and also choose $m < 2$ to ensure
that $x > 1$ so that $x^2 - 1$ will make sense as a leg of a triangle.

Substituting $x = \frac{2}{m^2 - 2}$ into $x^3 + 2x^2 + x$ and simplifying we get
$\frac{2m^4}{(m^2 - 2)^3}$, and this will be a cube if $2m^4$ is a cube. Now, it is a simple matter
of choosing $m$ in the range $\sqrt{2} < m < 2$ in order for $2m^4$ to be a cube.

For example, we can let $m = \frac{500}{343}$. In this case, $x = \frac{235298}{14702}$, and we get

\[ \frac{470596}{14702}, \quad \frac{55149000000}{216148804}, \quad \frac{55581297608}{216148804} \]

for the three sides of the triangle.

Diophantus chose $m = \frac{27}{16}$, which yields $x = \frac{512}{217}$, and in turn
$\frac{1024}{217}$, $\frac{215055}{47089}$, $\frac{309233}{47089}$ for the three sides of the triangle.

[Figure S4.11: $1^3 + 2^3 + 3^3 + \cdots + n^3 = t_n^2$.]

4.11 (c) Induction is the most obvious way to prove this identity.
Nonetheless, Figure S4.11 provides a visual proof by Solomon W.
Golomb, Mathematical Gazette, 49 (May 1965), 199.

4.12 The Euclidean algorithm yields $1 = 13 \cdot 21 - 8 \cdot 34$; in other words, a
solution to the equation is $x = -8$ and $y = -13$. So, by Theorem 4.1, all
solutions are given by

\[ x = -8 - 21t, \quad y = -13 - 34t. \]

The continued fraction for $\frac{34}{21}$ is

\[ 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1}}}}}}} \]

and the convergents for this are $\frac{1}{1}, \frac{2}{1}, \frac{3}{2}, \frac{5}{3}, \frac{8}{5}, \frac{13}{8}, \frac{21}{13}, \frac{34}{21}$. Thus
$34 \cdot 13 - 21 \cdot 21 = 442 - 441 = 1$; and so another solution to the
equation is $x = 13$ and $y = 21$.

The value of $t$ above that produces this solution is $t = -1$, since for
the $x$ value we get $-8 + (-21)(-1) = 13$, and for the $y$ value,
$-13 - 34(-1) = 21$.

This continued fraction is the golden ratio (see Problem 3.36).

The numbers that appear as both the numerators and the
denominators of the convergents form the Fibonacci sequence:

    1, 1, 2, 3, 5, 8, 13, ...
4.14 The first partial quotient $q_1$ is 2, so now we invert the fractional part,
and get $\frac{1}{\sqrt{5} - 2} = \sqrt{5} + 2$. Therefore, the second partial quotient $q_2$ is 4.
The new fractional part is now $(\sqrt{5} + 2) - 4 = \sqrt{5} - 2$. Inverting this
fractional part, we get $\frac{1}{\sqrt{5} - 2} = \sqrt{5} + 2$, and we can see that all partial
quotients from this point on will be 4, since the pattern is now simply
repeating itself. Thus the continued fraction for $\sqrt{5}$ is

\[ \sqrt{5} = 2 + \cfrac{1}{4 + \cfrac{1}{4 + \cdots}} \]

The sequence of convergents is $\frac{2}{1}, \frac{9}{4}, \frac{38}{17}, \frac{161}{72}, \ldots$, and the fraction
$\frac{161}{72} = 2.2361$ correctly approximates $\sqrt{5}$ to four decimal places.

Since $9^2 - 5 \cdot 4^2 = 81 - 80 = 1$ and $161^2 - 5 \cdot 72^2 = 25921 - 25920 = 1$,
whereas $2^2 - 5 \cdot 1^2 = 4 - 5 = -1$ and $38^2 - 5 \cdot 17^2 = 1444 - 1445 = -1$,
it appears likely that every other convergent (the even ones in the
sequence) will provide solutions to $x^2 - 5y^2 = 1$.
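
The alternating pattern is easy to test numerically for a few more convergents. In this sketch the function name `sqrt5_convergents` is hypothetical, and it assumes (as derived above) that every partial quotient after the first is 4.

```python
def sqrt5_convergents(count):
    """First `count` convergents (numerator, denominator) of 2 + 1/(4 + 1/(4 + ...))."""
    h_prev, h = 1, 2
    k_prev, k = 0, 1
    out = [(h, k)]
    for _ in range(count - 1):
        h_prev, h = h, 4 * h + h_prev
        k_prev, k = k, 4 * k + k_prev
        out.append((h, k))
    return out

for x, y in sqrt5_convergents(6):
    # Third column is x^2 - 5y^2 for each convergent x/y.
    print(x, y, x * x - 5 * y * y)
```

A finite check of this kind only supports the conjecture; it is not a proof that the pattern continues.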

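4.17 When we invert $\sqrt{3} - 1$, we get

\[ \frac{1}{\sqrt{3} - 1} = \frac{1}{\sqrt{3} - 1} \cdot \frac{\sqrt{3} + 1}{\sqrt{3} + 1} = \frac{\sqrt{3} + 1}{2}. \]

Substituting $\sqrt{3} = \frac{a}{b}$ for one of the $\sqrt{3}$ terms yields the expression

\[ \frac{1}{\frac{a}{b} - 1} = \frac{\sqrt{3} + 1}{2}. \]

Solving for $\sqrt{3}$ produces the “smaller” fraction $\sqrt{3} = \frac{3b - a}{a - b}$.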

4.18 (a) $x = 2 - 3t$, $y = 2 - 10t$; (b) $x = -2 + 35t$, $y = 1 - 11t$; (c) no
solution, since gcd(12, 15) = 3, and $3 \nmid 20$.

4.20 There are three solutions: 71 cows and 9 horses; 40 cows and 30 horses;
and 9 cows and 51 horses.

4.21 Let $x$, $y$, $z$ be a solution and, without loss of generality, assume
$x \le y \le z$. Then, $\left(1 + \frac{1}{x}\right)^3 \ge 2$, since $1 + \frac{1}{x} \ge 1 + \frac{1}{y} \ge 1 + \frac{1}{z}$. It follows
that $x < 4$. Also, note that $x > 1$, since $1 + \frac{1}{1} = 2$. Therefore, we can
conclude that $x = 2$ or $x = 3$.

First, we consider the case $x = 2$. In this case, we have

\[ \left(1 + \tfrac{1}{y}\right)\left(1 + \tfrac{1}{z}\right) = \tfrac{4}{3}, \]

and, since $y \le z$, we get $\left(1 + \frac{1}{y}\right)^2 \ge \frac{4}{3}$. It follows that $y < 7$, that is,
$y = 2, 3, 4, 5$, or 6.

For $y = 2$ and $y = 3$, when we solve for $z$ in the above equation, we
do not get a positive integer value for $z$. However, for $y = 4$, we get
$\left(1 + \frac{1}{z}\right) = \frac{16}{15}$, and so $z = 15$ is a solution. For $y = 5$, we get $\left(1 + \frac{1}{z}\right) = \frac{10}{9}$,
and so $z = 9$ is a solution. And for $y = 6$, we get $\left(1 + \frac{1}{z}\right) = \frac{8}{7}$, and so
$z = 7$ is a solution.

Next, we consider the case $x = 3$. In this case, we have

\[ \left(1 + \tfrac{1}{y}\right)\left(1 + \tfrac{1}{z}\right) = \tfrac{3}{2}, \]

and, since $y \le z$, we get $\left(1 + \frac{1}{y}\right)^2 \ge \frac{3}{2}$. It follows that $y < 5$, that is,
$y = 2, 3$, or 4.

For $y = 2$, there is no solution. For $y = 3$, we get $\left(1 + \frac{1}{z}\right) = \frac{9}{8}$, and so
$z = 8$ is a solution. For $y = 4$, we get $\left(1 + \frac{1}{z}\right) = \frac{6}{5}$, and so $z = 5$ is a
solution.

In summary, the set of all solutions is given by the following triples
$(x, y, z)$:

    (2, 4, 15), (2, 5, 9), (2, 6, 7), (3, 3, 8), (3, 4, 5).
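
The case analysis can be double-checked with exact rational arithmetic. The sketch below assumes the equation being solved is $\left(1 + \frac{1}{x}\right)\left(1 + \frac{1}{y}\right)\left(1 + \frac{1}{z}\right) = 2$ with $x \le y \le z$, and it relies on the bounds $x < 4$ and $y < 7$ obtained above; the function name `solutions` is hypothetical.

```python
from fractions import Fraction

def solutions():
    """Search for x <= y <= z with (1 + 1/x)(1 + 1/y)(1 + 1/z) = 2."""
    found = []
    for x in range(1, 4):            # the argument above gives x < 4
        for y in range(x, 7):        # and y < 7
            partial = Fraction(x + 1, x) * Fraction(y + 1, y)
            ratio = Fraction(2) / partial    # required value of 1 + 1/z
            if ratio <= 1:
                continue                     # no positive z can work
            z = 1 / (ratio - 1)
            if z.denominator == 1 and z >= y:
                found.append((x, y, int(z)))
    return found

print(solutions())
```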


Chapter 5 

5.3 Suppose $m = a^2 + b^2$ and $n = c^2 + d^2$ are each a sum of two squares. Let
$z$ be the complex number defined by $z = (a + bi)(c + di)$. Then

\[ |z| = |(a + bi)(c + di)| = |a + bi| \cdot |c + di| = \sqrt{a^2 + b^2} \cdot \sqrt{c^2 + d^2}. \]

Therefore,

\[ |z|^2 = (a^2 + b^2)(c^2 + d^2) = mn, \]

and $mn$ is a sum of squares since for the complex number $z = e + fi$
where $e$ and $f$ are integers, $|z|^2 = e^2 + f^2$ is automatically a sum of
squares.

Next, in order to write 65 as a sum of two squares, we write
$z = (1 + 2i)(2 + 3i) = -4 + 7i$, and so $65 = |z|^2 = (-4)^2 + 7^2 = 4^2 + 7^2$.

5.4 Since this asks us to solve x 2 + 5 = y 2 and x 2 - 5 = z 2 , this is Problem 
1.19, so the solution isx = 

5.5 Since 89 is prime, we seek a primitive Pythagorean triple $\{x, y, 89\}$
where $x^2 + y^2 = 89^2$. By Theorem 1.1, we know that $89 = s^2 + t^2$, but
since 89 can be written as a sum of two squares in only one way by
Theorem 5.1, there can only be one such triangle. Setting $s = 8$ and
$t = 5$, we get $x = 2st = 80$, and $y = s^2 - t^2 = 39$. The desired triangle
has sides 39, 80, 89.

5.6 (a) By Theorem 1.1, $n = s^2 + t^2$ where $s$ and $t$ are relatively prime. Then,
$n$ is odd, and, by Theorem 5.3, $n$ is a product of primes of the form
$4k + 1$.

(b) We do this recursively. To express $5^3$ in the form $s^2 + t^2$ we use
the fundamental identity of Fibonacci to compute

\[ 5^3 = 5^2 \cdot 5 = (3^2 + 4^2)(1^2 + 2^2) = 11^2 + 2^2 = 5^2 + 10^2. \]

Since we want $s$ and $t$ to be relatively prime we choose $5^3 = 11^2 + 2^2$.
Now, to express $5^3 \cdot 13$ in the form $s^2 + t^2$ we repeat this idea, getting

\[ 5^3 \cdot 13 = (11^2 + 2^2)(2^2 + 3^2) = 28^2 + 29^2 = 16^2 + 37^2. \]

Here, either choice for $s$ and $t$ is fine, so we conclude that 1625 is the
hypotenuse for two primitive Pythagorean triangles {57, 1624, 1625}
and {1113, 1184, 1625}.
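
The two products above can be reproduced with a few lines of code. This is only an illustrative sketch: the helper names `fibonacci_identity` and `triple_from` are hypothetical, and the triple formula $(s^2 - t^2,\ 2st,\ s^2 + t^2)$ is assumed to be the one from Theorem 1.1.

```python
def fibonacci_identity(a, b, c, d):
    """Write (a^2 + b^2)(c^2 + d^2) as a sum of two squares in both ways."""
    first = (abs(a * c - b * d), abs(a * d + b * c))
    second = (abs(a * c + b * d), abs(a * d - b * c))
    return first, second

def triple_from(s, t):
    """Pythagorean triple (s^2 - t^2, 2st, s^2 + t^2), with s taken as the larger."""
    s, t = max(s, t), min(s, t)
    return (s * s - t * t, 2 * s * t, s * s + t * t)

print(fibonacci_identity(3, 4, 1, 2))     # the two forms of 5^3 = 125
for s, t in fibonacci_identity(11, 2, 2, 3):
    print((s, t), triple_from(s, t))      # the two forms of 1625 and their triples
```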

5.8 These three terms produce remainders {1, 3, 9}. Therefore, the other 
sets of remainders become {2, 6, 5}, {4, 12, 10}, and {7, 8, 11}. 
5.9 For 31 we can write

\[ 2^{341} - 2 = 2(2^{340} - 1) = 2(2^5 - 1)(2^{335} + 2^{330} + \cdots + 1), \]

so $31 \mid (2^{341} - 2)$.

For 11, we write

\[ 2^{341} - 2 = 2(2^{340} - 1) = 2(2^{10} - 1)(2^{330} + 2^{320} + \cdots + 1), \]

so $11 \mid (2^{341} - 2)$, by Fermat’s little theorem. Thus, $341 \mid (2^{341} - 2)$.

You might have noticed here that, since $2^{10} - 1 = (2^5 - 1)(2^5 + 1)$, it
is true that both 11 and 31 divide $2^{10} - 1$; hence $341 \mid (2^{10} - 1)$. Therefore,
it is possible to give a shorter proof than we gave above merely by
observing that $2^{10} - 1 = 1023 = 3 \cdot 341$ (without using Fermat’s little
theorem) and showing as we did that $(2^{10} - 1) \mid (2^{341} - 2)$.
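
Both observations are quick to confirm with modular exponentiation. A minimal sketch, in which the function name `passes_base2_test` is hypothetical and Python's built-in three-argument `pow` does the work:

```python
def passes_base2_test(n):
    """True when n divides 2^n - 2, i.e. 2^n is congruent to 2 modulo n."""
    return pow(2, n, n) == 2 % n

print(341 == 11 * 31)              # 341 is composite
print(passes_base2_test(341))      # yet it divides 2^341 - 2
print((2**10 - 1) % 341)           # remainder of 1023 on division by 341
```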

5.10 If we replace the 34 by $-5$ in $34^2 + 1^2$ we get $(-5)^2 + 1^2$, so we can write

\[ 13 \cdot 2 = (-5)^2 + 1^2. \]

Thus

\[ 13^2 \cdot 2 \cdot 89 = (34^2 + 1^2)((-5)^2 + 1^2) = (-169)^2 + 39^2, \]

which, dividing by $13^2$, simplifies to

\[ 2 \cdot 89 = 13^2 + 3^2. \]

Again we now have a smaller multiple of 89 written as a sum of two
squares, and we can repeat the process.

We find the remainders of 13 and 3 when divided by 2, which are
both 1, and we replace 13 and 3 by 1 in $13^2 + 3^2$ to get $1^2 + 1^2$. Then,
since $1^2 + 1^2$ is a multiple of 2, we write

\[ 2 \cdot 1 = 1^2 + 1^2. \]

Finally,

\[ 2^2 \cdot 89 = (13^2 + 3^2)(1^2 + 1^2), \]

and

\[ 2^2 \cdot 89 = (13 \cdot 1 + 3 \cdot 1)^2 + (13 \cdot 1 - 3 \cdot 1)^2 = 16^2 + 10^2 = (2 \cdot 8)^2 + (2 \cdot 5)^2, \]

which, dividing by $2^2$, simplifies to

\[ 89 = 8^2 + 5^2. \]

5.11 $x^6 \equiv 1 \pmod{13}$ is satisfied for $x = 1, 3, 4, 9, 10, 12$. (Note that $x = a$
satisfies this congruence if and only if $x = 13 - a$ also satisfies this
congruence.)

5.13 Suppose $89 = a^2 + b^2$. Since $13 \cdot 89 = 34^2 + 1$, we write
$34^2 b^2 = (13 \cdot 89 - 1)(89 - a^2)$, and $89 \mid (34^2 b^2 - a^2) = (34b - a)(34b + a)$.
Thus $89 \mid (34b - a)$, or $89 \mid (34b + a)$. We may as well assume $89 \mid (34b - a)$,
say $34b - a = 89k$. Now, we use the fundamental identity of Fibonacci to write

\[ 89^2 = (a^2 + b^2)(5^2 + 8^2) = (5a + 8b)^2 + (8a - 5b)^2. \]

But, in turn, since $a = 34b - 89k$, this can be rewritten as

\[ 89^2 = (2 \cdot 89b - 5 \cdot 89k)^2 + (3 \cdot 89b - 8 \cdot 89k)^2, \]

which simplifies to

\[ 1 = (2b - 5k)^2 + (3b - 8k)^2. \]

Thus we have four possible solutions to consider, all corresponding to
$1 = (\pm 1)^2 + 0^2$. Each of these gives us a system of simultaneous
equations to solve; so, for example, one possibility is that $2b - 5k = 1$,
and $3b - 8k = 0$. This yields $b = 8$ and $k = 3$, which in turn means that
$a = 5$, as desired. The other three cases are similar.

5.14 Assume $a^n - 1$ is prime. If $a > 2$, then
$a^n - 1 = (a - 1)(a^{n-1} + a^{n-2} + \cdots + 1)$, and $a^n - 1$ is composite, which is
a contradiction. Therefore, $a = 2$.

If $d \mid n$, then $2^n - 1 = (2^d - 1)(2^{n-d} + 2^{n-2d} + \cdots + 1)$. But $2^n - 1$ is
prime, so $d = 1$ or $d = n$. Therefore, $n$ is prime.

5.17 Since $N$ is odd and a sum of two squares, and since 2 and $3 \cdot 5 \cdot 7 \cdots p_k$
are relatively prime, then, by Theorem 5.3, all prime divisors of $N$ are of
the form $p = 4n + 1$. Such a prime divisor of $N$ cannot be one of the
primes less than or equal to $p_k$, otherwise, it would also divide $2^2$,
hence it must be greater than $p_k$. Therefore, there is no largest prime of
the form $p = 4n + 1$.
5.19 Since $120 = 2^3 \cdot 3 \cdot 5$, it has $15 = (4 \cdot 2 \cdot 2 - 1)$ proper divisors, and their
sum is 1 + 2 + 3 + 4 + 5 + 6 + 8 + 10 + 12 + 15 + 20 + 24 + 30 +
40 + 60 = 240.

Since $672 = 2^5 \cdot 3 \cdot 7$, it has $23 = (6 \cdot 2 \cdot 2 - 1)$ proper divisors, and
their sum is 1 + 2 + 3 + 4 + 6 + 7 + 8 + 12 + 14 + 16 + 21 + 24 + 28 +
32 + 42 + 48 + 56 + 84 + 96 + 112 + 168 + 224 + 336 = 1344.
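
These counts and sums can be confirmed by listing the proper divisors directly. A minimal sketch (the function name `proper_divisors` is an assumption made here):

```python
def proper_divisors(n):
    """All positive divisors of n other than n itself."""
    return [d for d in range(1, n) if n % d == 0]

for n in (120, 672):
    divs = proper_divisors(n)
    # Number of proper divisors, their sum, and the ratio of the sum to n.
    print(n, len(divs), sum(divs), sum(divs) // n)
```

In both cases the sum of the proper divisors is exactly twice the number.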

5.20 This number has seventy-seven digits! The solution is mostly a matter
of bookkeeping. When you do the sum for the prime 616 318 177, you
get $1 + 616\,318\,177 = 2 \cdot 7^3 \cdot 898\,423$, so you record that one prime
898 423 showed up, three 7s showed up, and one 2 showed up. You do
this for each prime. So, for 112 303, since $1 + 112\,303 = 2^4 \cdot 7019$, you
record that 7019 showed up and that four more 2s showed up. That’s all
there is to it.

Note that for $2^{36}$, the sum of the divisors is $2^{37} - 1 = 223 \cdot 616\,318\,177$,
which makes it clear that Frenicle constructed this problem to make
use of Fermat’s factorization of $2^{37} - 1$. After all the bookkeeping is
done, you will have one extra 2, and one extra 3, which means the sum
of all the divisors is six times the number, so the sum of the proper
divisors is five times the number, as claimed.

5.22 For the first number, $2^{n+1}(18 \cdot 2^{2n} - 1)$, the individual sums will be
$1 + 2 + \cdots + 2^{n+1} = 2^{n+2} - 1$, and $1 + (18 \cdot 2^{2n} - 1) = 18 \cdot 2^{2n}$, so the
sum of the proper divisors is

\[ (2^{n+2} - 1)(18 \cdot 2^{2n}) - 2^{n+1}(18 \cdot 2^{2n} - 1), \]

which can be reduced to

\[ 2^{n+1}(18 \cdot 2^{2n} - 9 \cdot 2^n + 1), \]

which is the second number.

For the second number, $2^{n+1}(3 \cdot 2^n - 1)(6 \cdot 2^n - 1)$, the individual
sums will be $1 + 2 + \cdots + 2^{n+1} = 2^{n+2} - 1$, $1 + (3 \cdot 2^n - 1) = 3 \cdot 2^n$, and
$1 + (6 \cdot 2^n - 1) = 6 \cdot 2^n$, so the sum of the proper divisors is

\[ (2^{n+2} - 1)(3 \cdot 2^n)(6 \cdot 2^n) - 2^{n+1}(3 \cdot 2^n - 1)(6 \cdot 2^n - 1), \]

which reduces to the first number, $2^{n+1}(18 \cdot 2^{2n} - 1)$.
5.23 Assume that $k$ had an odd divisor $d > 1$. Then, we can factor $a^k + 1$ as
follows:

\[ a^k + 1 = (a^{k/d} + 1)(a^{k - k/d} - a^{k - 2k/d} + a^{k - 3k/d} - \cdots + 1). \]

It is the fact that $d$ is odd that makes the alternating signs work out
for the $d$ terms in the right-hand expression. Note also that this is just
the same as $x + 1$ being a factor of $x^d + 1$ when $d$ is odd, which we
know to be true not only because the alternating signs work out, but
also because $-1$ is a solution of $x^d + 1 = 0$ when $d$ is odd.

5.26 First, we show that $5 \mid (2^{2^n} - 1)$, for all $n \ge 2$. But this follows by
induction, since $5 \mid (2^{2^2} - 1)$, and since

\[ 2^{2^{n+1}} - 1 = (2^{2^n})^2 - 1 = (2^{2^n} - 1)(2^{2^n} + 1). \]

Now we show that, for all $n \ge 2$, any two consecutive Fermat
numbers are congruent modulo 10; hence they all end in the same
digit. But the difference between two consecutive Fermat numbers is
just $2^{2^{n+1}} - 2^{2^n} = 2^{2^n}(2^{2^n} - 1)$, and we have just shown that this is
divisible by 5, and since it is also clearly divisible by 2, it must be
divisible by 10, as desired.

Alternately, you can argue by induction that for $n \ge 2$, $2^{2^n}$ ends in 6.
Obviously, $6^2 = 36$ ends in 6; moreover, whenever $2^{2^n}$ ends in 6, then it
follows that $2^{2^{n+1}} = (2^{2^n})^2$ also ends in 6.

5.29 (a) Since $F_0 = F_1 - 2$, the formula holds for $n = 0$. Assuming the
formula holds for an integer $n \ge 0$, then

\[ F_0 F_1 F_2 \cdots F_n F_{n+1} = (F_{n+1} - 2)F_{n+1} = (2^{2^{n+1}} + 1 - 2)(2^{2^{n+1}} + 1) = (2^{2^{n+1}} - 1)(2^{2^{n+1}} + 1) = 2^{2^{n+2}} - 1 = F_{n+2} - 2, \]

and the formula holds by induction.

(c) Suppose that a positive integer $d$ divides two Fermat numbers $F_m$
and $F_n$ where $m < n$. Then, since

\[ F_n = F_0 F_1 F_2 \cdots F_m \cdots F_{n-1} + 2, \]

we conclude that $d \mid 2$. But $F_m$ and $F_n$ are odd, so $d \ne 2$; therefore, $d = 1$,
and $F_m$ and $F_n$ are relatively prime.
5.32 Since squares must have the form $4n$ or $4n + 1$, the difference of two
squares cannot have the form $4n + 2$. Any other number can be written
as a difference of two squares.

For an even number $4n$ (that is, for an even number not of the form
$4n + 2$), we can write $4n = (n + 1)^2 - (n - 1)^2$.

For an odd number $2n + 1$, we can write $2n + 1 = (n + 1)^2 - n^2$.

5.33 In this case $n = \lfloor \sqrt{2\,027\,651\,281} \rfloor = 45\,029$, so we begin the sequence
by computing $45\,030^2 - 2\,027\,651\,281 = 49\,619$, which is not a square.
For the next term, we add $2 \cdot 45\,030 + 1 = 90\,061$, and get 139 680,
which again is not a square. We successively add 90 063, 90 065,
90 067, . . . , until finally at the eleventh step we add 90 081 to get
1 040 400, which is a square, $1020^2$. Therefore, $2\,027\,651\,281 =
45\,041^2 - 1020^2 = 44\,021 \cdot 46\,061$.
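
The procedure just described can be sketched in a few lines of Python. The function name `fermat_factor` and the step counter are illustrative assumptions; for simplicity the sketch recomputes $x^2 - N$ at each step instead of adding successive odd numbers, which gives the same sequence of values.

```python
from math import isqrt

def fermat_factor(n):
    """Fermat's method for odd n: find x with x^2 - n a perfect square y^2,
    then n = (x - y)(x + y). Returns (x - y, x + y, steps)."""
    x = isqrt(n)
    if x * x < n:
        x += 1                  # start just above the square root
    steps = 0
    while True:
        y2 = x * x - n
        y = isqrt(y2)
        if y * y == y2:
            return x - y, x + y, steps
        x += 1                  # same as adding the next odd number to y2
        steps += 1

print(fermat_factor(2027651281))
```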

5.34 For 89 you simply eliminate all other possibilities by trying to write 89 
as a sum of five or fewer numbers in the following way. You can begin 
by assuming 70 is in the sum; then 19 needs to be summed with four or 
fewer numbers. Clearly one of them must be 12, so then 7 needs to be 
summed with three or fewer numbers, which can be done in only one 
way, 7 = 5 + 1 + 1. So, if 70 is in the sum, then five numbers are 
required. Next, you assume 70 is not in the sum, but that 51 is in the 
sum. Proceeding in this way yields the following representations: 

89 = 70 + 12 + 5 + 1 + 1 = 51 + 35 + 1 + 1 + 1 = 35 + 22 + 22 + 5 + 5 =
22 + 22 + 22 + 22 + 1.

The other four numbers that require five pentagonal numbers in 
their representation are 21, 31, 43, and 55. 

5.35 The seventh row in his diagram reads 1, 7, 21, 34, 35, 21, 7, 1. Clearly, 
the 34 was just a careless mistake, and should be 35. 

5.38 (a) Choosing $r$ elements from a set of $n$ elements is equivalent to
“choosing” a set of $n - r$ elements in the set not to be
chosen.

(b) We need to show that for any $r \le \frac{n}{2}$, $\binom{n}{r-1} < \binom{n}{r}$. But, by the formula
for the binomial coefficient in Theorem 5.7, we have

\[ \binom{n}{r} = \binom{n}{r-1} \cdot \frac{n - r + 1}{r}, \]

and $\frac{n - r + 1}{r} > 1$, since $r < \frac{n+1}{2}$.
5.39 The first formula holds for $n = 1$: $\binom{3}{3} = 1 = 1^2$. Now, assume that this
formula holds for some odd integer $n \ge 1$. Then

\[ 1^2 + 3^2 + \cdots + n^2 + (n+2)^2 = \binom{n+2}{3} + (n+2)^2 = \frac{(n+2)(n+1)n}{6} + \frac{6(n+2)^2}{6} \]

\[ = \frac{(n+2)(n^2 + n + 6n + 12)}{6} = \frac{(n+2)(n^2 + 7n + 12)}{6} = \frac{(n+2)(n+3)(n+4)}{6} = \binom{n+4}{3}, \]

and so, the formula holds for all odd $n$.

The proof of the second formula is almost identical.

5.40 (a) $r!\binom{p}{r} = \frac{p!}{(p-r)!} = p(p-1)(p-2) \cdots (p-r+1)$, so $p \mid r!\binom{p}{r}$. But $p$ and
$r!$ are relatively prime, so by Euclid’s lemma, Theorem 3.3, $p \mid \binom{p}{r}$.

(b) The easiest way to do this is to say that for $a < 0$, there is an integer
$b > 0$ such that $a \equiv b \pmod{p}$. Thus $a^p \equiv b^p \equiv b \equiv a \pmod{p}$.

Alternatively, you can argue that $a^p \equiv a \pmod{p}$ clearly holds for
$a = 0$. Suppose $a < 0$. Then, $-a > 0$, and so, by what we just proved in
part (a),

\[ (-a)^p \equiv -a \pmod{p}. \]

There are now two cases to consider. If $p$ is an odd prime, then

\[ a^p = -(-a)^p \equiv -(-a) = a \pmod{p}, \]

as desired.

If $p = 2$, then $a \equiv -a \pmod{p}$, since $2 \mid (a + a)$. So,

\[ a^p = a^2 = (-a)^2 \equiv -a \equiv a \pmod{p}, \]

and this completes the proof.
5.41 We can use (®) to illustrate the general argument. By repeated use of
Theorem 5.6, we get 



The last step, replacing Q by Q, is purely a matter of esthetics to make 
the formula nicer, and is true since both terms equal 1. 

5.47 (a) $S$ is finite because $x$, $y$, $z$ are positive integers that are restricted to
the intervals $1 \le x \le \sqrt{p}$, $1 \le y \le \frac{p}{4}$, and $1 \le z \le \frac{p}{4}$.

(b) If $x = y - z$, then $p = x^2 + 4yz = (y - z)^2 + 4yz = (y + z)^2$, which is
impossible, since $y + z > 1$ and $p$ is prime. If $x = 2y$, then
$p = (2y)^2 + 4yz$, and $4 \mid p$, which is also impossible.

(c) For $(x, y, z) \in A$, the question is whether $(x + 2z, z, y - x - z)$
satisfies the equation $x^2 + 4yz = p$. But

\[ (x + 2z)^2 + 4z(y - x - z) = x^2 + 4xz + 4z^2 + 4zy - 4zx - 4z^2 = x^2 + 4yz = p, \]

so $f(x, y, z) \in S$.

For $(x, y, z) \in B$, the question is whether $(2y - x, y, x - y + z)$
satisfies the equation $x^2 + 4yz = p$. But

\[ (2y - x)^2 + 4y(x - y + z) = 4y^2 - 4yx + x^2 + 4yx - 4y^2 + 4yz = x^2 + 4yz = p, \]

so $f(x, y, z) \in S$.

For $(x, y, z) \in C$, the question is whether $(x - 2y, x - y + z, y)$
satisfies the equation $x^2 + 4yz = p$. But

\[ (x - 2y)^2 + 4(x - y + z)y = x^2 - 4xy + 4y^2 + 4xy - 4y^2 + 4zy = x^2 + 4yz = p, \]

so $f(x, y, z) \in S$.

(d) For (x, y, z) ∈ A, we have f(x, y, z) = (x + 2z, z, y - x - z), and it is clear that x + 2z > 2z, so f(x, y, z) ∈ C.

For (x, y, z) ∈ B, we have f(x, y, z) = (2y - x, y, x - y + z), and it is clear that 2y - x < 2y and also that 2y - x > y - (x - y + z), since 0 > -z, so f(x, y, z) ∈ B.

For (x, y, z) ∈ C, we have f(x, y, z) = (x - 2y, x - y + z, y), and it is clear that x - 2y < (x - y + z) - y, so f(x, y, z) ∈ A.

(e) For (x, y, z) ∈ A, we have f(x, y, z) = (x + 2z, z, y - x - z); and, since this element is in C, we see that

f(x + 2z, z, y - x - z) = (x + 2z - 2z, x + 2z - z + y - x - z, z) = (x, y, z).

For (x, y, z) ∈ B, we have f(x, y, z) = (2y - x, y, x - y + z); and, since this element is in B, we see that

f(2y - x, y, x - y + z) = (2y - (2y - x), y, (2y - x) - y + (x - y + z)) = (x, y, z).

For (x, y, z) ∈ C, we have f(x, y, z) = (x - 2y, x - y + z, y); and, since this element is in A, we see that

f(x - 2y, x - y + z, y) = (x - 2y + 2y, y, (x - y + z) - (x - 2y) - y) = (x, y, z).

Therefore, the inverse of f is f.

(f) Since the function f takes elements in A to C, and elements in C to A, any fixed points of f must be in B. So, if (x, y, z) is a fixed point, then, since (x, y, z) ∈ B, we have

(x, y, z) = f(x, y, z) = (2y - x, y, x - y + z).

Therefore, x = 2y - x and z = x - y + z, and we conclude that x = y.

But (x, y, z) ∈ S, so (x, y, z) satisfies the equation x^2 + 4yz = p, which means, since x = y, that it satisfies x^2 + 4xz = p. Since p is prime, this means x = 1; and so, 1 + 4z = p, that is, z = (p - 1)/4. This, by the way, is where we use the fact that p ≡ 1 (mod 4). So, f has a single fixed point (1, 1, (p - 1)/4).

(g) If (x, y, z) e S, then x 2 + 4yz = p, so since (x, z, y) satisfies this same 
equation, it follows that£(x, y, z) e S. 

(h) For p = 29, (1, 7, 1) pairs with (1, 1, 7); (3, 5, 1) pairs with (3, 1, 5); and the element (5, 1, 1) is such that y = z. So we get 5^2 + (2 · 1)^2 = 29.
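The whole argument can be replayed by machine for p = 29. The sketch below is only an illustration: the names `zagier_set`, `f`, and `swap` are made up here, and the three cases of `f` follow the regions A (x < y - z), B (y - z < x < 2y), and C (x > 2y) as they are used in parts (c) through (f).

```python
from math import isqrt

def zagier_set(p):
    """All positive triples (x, y, z) with x^2 + 4yz = p, using the bounds of part (a)."""
    return [(x, y, z)
            for x in range(1, isqrt(p) + 1)
            for y in range(1, p // 4 + 1)
            for z in range(1, p // 4 + 1)
            if x * x + 4 * y * z == p]

def f(t):
    """The involution of parts (c)-(e), defined piecewise on the regions A, B, C."""
    x, y, z = t
    if x < y - z:                          # region A
        return (x + 2 * z, z, y - x - z)
    if x < 2 * y:                          # region B
        return (2 * y - x, y, x - y + z)
    return (x - 2 * y, x - y + z, y)       # region C

def swap(t):
    """The second involution of part (g): exchange y and z."""
    x, y, z = t
    return (x, z, y)

S = zagier_set(29)
fixed_by_f = [t for t in S if f(t) == t]
fixed_by_swap = [t for t in S if swap(t) == t]
x, y, _ = fixed_by_swap[0]
print(S, fixed_by_f, fixed_by_swap, x * x + (2 * y) ** 2)
```

Because f has exactly one fixed point, S has odd size, so the swap must also fix some triple, and that triple yields the representation as a sum of two squares.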
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Chapter 6 

6.1 In this case, the numbers 5, 2 · 5, 3 · 5, ..., 16 · 5 become, as expected, the numbers from 1 to 16:

5, 10, 15, 3, 8, 13, 1, 6, 11, 16, 4, 9, 14, 2, 7, 12.


6.2 Euclid's algorithm produces 1 = 2 · 16 + (-1) · 31, so x ≡ 2 · 13 = 26 is the solution.

This could also be solved by multiplying the congruence by 2, yielding 32x ≡ 26 (mod 31), which reduces to x ≡ 26 (mod 31).

The method of letting x run through all values 0, 1, 2, . . . , 30 is just 
a case of bad luck in this instance, since you have to wait until you get 
to x = 26 for 16x to be congruent to 13. 


6.3 


(a) If x is a solution to the congruence ax ≡ b (mod m), then m | ax - b, so my = ax - b, for some integer y. Therefore, x, y is a solution to the equation ax - my = b.

Conversely, if x, y is a solution to the linear Diophantine equation ax - my = b, then my = ax - b, and so m | ax - b, and ax ≡ b (mod m). Therefore, x is a solution to the congruence ax ≡ b (mod m).

(b) By Theorem 4.1, if d is the greatest common divisor of a and m, then the equation ax - my = b has a solution if and only if d | b. Therefore, by part (a), the linear congruence ax ≡ b (mod m) has a solution if and only if d | b.

Now, if x is a solution for the congruence, then x + r(m/d) is also a solution, for any integer r, as we can see, since a(x + r(m/d)) = ax + ((a/d)r)m ≡ ax ≡ b (mod m). (Note that the fact that d | m is what makes x + r(m/d) an integer for any r, and the fact that d | a is what allows us to conclude that ((a/d)r)m ≡ 0 (mod m) in the last step.)

Therefore, if x is any solution, then

x, x + m/d, x + 2m/d, x + 3m/d, ..., x + (d - 1)m/d

are d distinct solutions modulo m. These are clearly distinct solutions because the difference between any two of them is less than m.

We still need to show that these are all the solutions. It suffices to show that the x value of any of the infinitely many solutions provided by Theorem 4.1 is congruent to one of these d solutions. But if x + t(m/d) is one of these solutions for ax - my = b, then we can write t = qd + r, where 0 ≤ r < d, and

x + t(m/d) = x + (qd + r)(m/d) ≡ x + r(m/d) (mod m),

as desired.
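The recipe in parts (a) and (b) translates directly into a short routine. This is a minimal sketch, not the book's own method of computation: the function name is illustrative, and the modular inverse comes from Python's built-in three-argument `pow` (Python 3.8 or later) rather than from running Euclid's algorithm by hand.

```python
from math import gcd

def solve_linear_congruence(a, b, m):
    """Return all solutions x in [0, m) of a*x = b (mod m), in increasing order."""
    d = gcd(a, m)
    if b % d != 0:
        return []                       # solvable only when d | b
    a0, b0, m0 = a // d, b // d, m // d
    x0 = (b0 * pow(a0, -1, m0)) % m0    # the unique solution modulo m/d
    return [x0 + r * m0 for r in range(d)]   # the d solutions x0 + r(m/d)

print(solve_linear_congruence(16, 13, 31))   # the congruence of Problem 6.2
print(solve_linear_congruence(6, 9, 15))     # d = 3, so three solutions
```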
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6.5 (a) The solution modulo 30 for the third congruence has to be x = 3 

or x = 18. Since neither of these is a solution for either of the other two 
congruences, there is no simultaneous solution. 

(b) The solution modulo 30 for the third congruence has to be x = 2 or 
x = 17. Since x = 17 satisfies the other two congruences, but x = 2 
doesn't, x = 17 is the unique solution. 

(c) A system of congruences x ≡ a_i (mod m_i) will have a solution if and only if, for each pair i, j, gcd(m_i, m_j) | a_i - a_j; when it exists, the solution is unique modulo the least common multiple of the moduli.


6.6 In the first version of the story, we are asked for a simultaneous 
solution for the congruences 


x ≡ 1 (mod 2); x ≡ 1 (mod 3); x ≡ 1 (mod 4);
x ≡ 1 (mod 5); x ≡ 1 (mod 6); x ≡ 0 (mod 7).


In this system of congruences, we can eliminate the congruence modulo 2, since if x satisfies the congruence x ≡ 1 (mod 4), then x will also satisfy the congruence x ≡ 1 (mod 2). Similarly, we can also eliminate the congruence modulo 6, since if x satisfies the congruences x ≡ 1 (mod 4) and x ≡ 1 (mod 3), then x will also satisfy the congruence x ≡ 1 (mod 6) (this is because x is odd and cannot satisfy the congruence x ≡ 4 (mod 6)).

Therefore, we can deal with the reduced system 


x = 1 (mod 3); x = 1 (mod 4); x = 1 (mod 5); x = 0 (mod 7). 


Thus m = 3 · 4 · 5 · 7 = 420, and

M_1 = 4 · 5 · 7 = 140, M_2 = 3 · 5 · 7 = 105, M_3 = 3 · 4 · 7 = 84, M_4 = 3 · 4 · 5 = 60.

The first congruence, 140x ≡ 1 (mod 3), reduces to 2x ≡ 1 (mod 3), and x = 2; thus N_1 = 2. Next, 105x ≡ 1 (mod 4) has an easy solution x = 1; thus N_2 = 1. The third congruence, 84x ≡ 1 (mod 5), we reduce to 4x ≡ 1 (mod 5), and x = 4 is the solution; thus N_3 = 4. The final congruence, 60x ≡ 1 (mod 7), reduces to 4x ≡ 1 (mod 7), which also has an obvious solution x = 2; thus N_4 = 2.
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Therefore, the solution for the first version of the story is given by 

x = 1 · 140 · 2 + 1 · 105 · 1 + 1 · 84 · 4 + 0 · 60 · 2 = 721 ≡ 301 (mod 420).

Hence the smallest number of eggs the woman could have had in her 
basket was 301. 
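The computation with the M_i and N_i can be checked mechanically. The following is a small sketch of the same construction for pairwise relatively prime moduli; the function name `crt` is illustrative, and the inverses N_i are obtained with Python's built-in `pow(M_i, -1, m_i)` instead of by inspection.

```python
from math import prod

def crt(residues, moduli):
    """Solve x = a_i (mod m_i) for pairwise coprime moduli; return (x, m)."""
    m = prod(moduli)
    total = 0
    for a_i, m_i in zip(residues, moduli):
        M_i = m // m_i               # product of the other moduli
        N_i = pow(M_i, -1, m_i)      # inverse of M_i modulo m_i
        total += a_i * M_i * N_i
    return total % m, m

eggs, modulus = crt([1, 1, 1, 0], [3, 4, 5, 7])
print(eggs, modulus)
```

The answer can then be tested against the original six congruences, including the two that were eliminated.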

6.8 18! + 1 = 6 402 373 705 728 001 = 19 • 336 967 037 143 579 . 

6.9 The “trick” here is to realize that (p - 1)! is just a product of the odd 
numbers interlaced with the even numbers, but if you take the even 
numbers backwards from p - 1, they are just the odd numbers again 
(modulo p) with negative signs. Thus 

(p - 1)! ≡ 1 · 3 · 5 ... (p - 2) · (-(p - 2)) ... (-5) · (-3) · (-1)

= 1^2 · 3^2 · 5^2 ... (p - 2)^2 · (-1)^((p-1)/2) (mod p).

But, by Wilson's theorem, (p - 1)! ≡ -1 (mod p), so we conclude that

1^2 · 3^2 · 5^2 ... (p - 2)^2 ≡ (-1)^((p+1)/2) (mod p).

We can prove the identity for the squares of the even numbers in the same way, or we can observe that ((p - 1)!)^2 ≡ (-1)^2 = 1 (mod p), and since the product of the odd squares has been determined to be ±1, the product of the even squares must also be ±1 with the same sign.

6.11 Let a be a primitive root for the odd prime p. Then, since we know that p - 1 is the least exponent for which a^(p-1) ≡ 1 (mod p), we conclude that a^((p-1)/2) ≢ 1 (mod p). Therefore, a^((p-1)/2) ≡ -1 (mod p), so we can write

(p - 1)! ≡ a · a^2 · a^3 ... a^(p-1) = a^(1+2+3+...+(p-1)) (mod p)

= a^(p(p-1)/2) = (a^((p-1)/2))^p ≡ (-1)^p = -1 (mod p).

6.12 (1) x = 12 is a solution, so x = -12 ≡ 17 (mod 29) is also a solution.

(2) 2^7 will be a solution since 7 = (p - 1)/4. Thus, 2^7 = 128 ≡ 12 (mod 29) is a solution; and hence, again, 17 is also a solution.

(3) 14! ≡ 12 (mod 29), so x = 12 and x = 17 are solutions.

Incidentally, these solutions are easily checked, since 29 divides both 12^2 + 1 = 145 and 17^2 + 1 = 290.
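All three routes to a square root of -1 modulo 29 can be confirmed in a few lines. This is only a check of the arithmetic above; the variable names are illustrative, and the base 2 is used because the solution itself uses it.

```python
from math import factorial

p = 29

# (1) Brute force: every x in [0, p) with x^2 + 1 divisible by p.
by_search = [x for x in range(p) if (x * x + 1) % p == 0]

# (2) The power 2^((p-1)/4) of the base used in the solution.
by_power = pow(2, (p - 1) // 4, p)

# (3) The factorial ((p-1)/2)! reduced modulo p.
by_factorial = factorial((p - 1) // 2) % p

print(by_search, by_power, by_factorial)
```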
6.14 Assume, by way of contradiction, that p ∤ a. Then, by Theorem 6.1, a has an inverse modulo p; that is, there is an integer c such that ac ≡ 1 (mod p). So, we can multiply the congruence

a^2 + b^2 ≡ 0 (mod p)

by c^2, and get

1 + (bc)^2 ≡ 0 (mod p).

But this contradicts Theorem 6.4, since p is of the form 4n + 3. Hence p | a. Similarly, p | b.

6.15 (a) Assume that p_1, p_2, ..., p_k are all the primes of the form 4n + 3. Let N = 4 · p_1 · p_2 ... p_k - 1. Then N is of the form 4n + 3, that is, N = 4 · p_1 · p_2 ... p_k - 1 ≡ 3 (mod 4). Suppose the prime factorization of N = q_1 · q_2 ... q_j consists entirely of primes q_i of the form 4n + 1. Then N ≡ 1 ... 1 = 1 (mod 4) is a contradiction to N ≡ 3 (mod 4). Therefore, not all of the primes q_i can be of the form 4n + 1.

Therefore, N is divisible by a prime q of the form 4n + 3, but q cannot be any of the primes p_1, p_2, ..., p_k, since otherwise q would divide 1. This contradicts the assumption that p_1, p_2, ..., p_k are all the primes of the form 4n + 3; hence there are an infinite number of primes of this form.

6.16 (c) By Fermat's little theorem, any a ∈ {0, 1, 2, ..., p - 1} is a zero for the congruence x^p - x ≡ 0 (mod p). So we can use the same argument used in the proof of Lagrange's theorem, and factor x^p - x into a product of p linear factors: x(x - 1) ... (x - (p - 1)).

(d) In the expression in part (c), the coefficient of x on the right-hand side is -1. On the left-hand side, the coefficient of x is given by (-1)(-2) ... (-(p - 1)). Since p - 1 is even, the number of minus signs that contribute to the coefficient of the x term will be even. Therefore, the coefficient of x on the left side is just (p - 1)!. Hence (p - 1)! ≡ -1 (mod p).


Chapter 7 

7.1 Euler (OY ler), Aryabhata (AHR yuh BHUT uh), Bachet (bah SHAY), 
Erdos (AIR dish), Fermat (fair MAH), Gauss (gouse — rhymes with 
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grouse), Lagrange (lah GRAHNZH), Thabit ibn Qurra (THA bit I ben 
koo rah), Leibniz (LYPE niz), Mersenne (mair SEN), and Zu (dzoo). 

7.2 (a) 1, 3, 5, 9, 11, and 13 is the reduced residue system, and 

5, 15 ≡ 1, 25 ≡ 11, 45 ≡ 3, 55 ≡ 13, 65 ≡ 9 (mod 14)

is the same reduced residue system in another order.

(b) 5^6 = 15 625 ≡ 1 (mod 14).

(c) 5^2 = 25 ≡ 11 (mod 14) and 5^3 = 125 ≡ -1 (mod 14); therefore, the order of 5 is 6.

7.3 Since f is multiplicative we have 

f(1) = f(1 · 1) = f(1)f(1).

Therefore, x = f(1) is an integer that satisfies x = x^2, so x = 0 or x = 1. Since f(1) is positive, f(1) = 1.

7.4 Since, in this case, r = 1, 5 and s = 1, 2, 3, 4, there are eight congruences to solve of the form

x ≡ r (mod 6) and x ≡ s (mod 5).

For r = 1, the solutions for x ≡ 1 (mod 6) are 1, 7, 13, 19, 25. Therefore, we get the following four simultaneous solutions:

for x ≡ 1 (mod 6) and x ≡ 1 (mod 5), we get x = 1;

for x ≡ 1 (mod 6) and x ≡ 2 (mod 5), we get x = 7;

for x ≡ 1 (mod 6) and x ≡ 3 (mod 5), we get x = 13;

for x ≡ 1 (mod 6) and x ≡ 4 (mod 5), we get x = 19.

For r = 5, the solutions for x ≡ 5 (mod 6) are 5, 11, 17, 23, 29. Therefore, we get the following four simultaneous solutions:

for x ≡ 5 (mod 6) and x ≡ 1 (mod 5), we get x = 11;

for x ≡ 5 (mod 6) and x ≡ 2 (mod 5), we get x = 17;

for x ≡ 5 (mod 6) and x ≡ 3 (mod 5), we get x = 23;

for x ≡ 5 (mod 6) and x ≡ 4 (mod 5), we get x = 29.
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The eight systems of congruences, therefore, produce precisely eight 
solutions, 1, 7, 11, 13, 17, 19, 23, 29, and these are the eight numbers 
less than and relatively prime to 30. 

7.5 (a) Suppose x_1 and x_2 are solutions in the set that are not distinct modulo ab; that is, x_1 ≡ x_2 (mod ab). So ab | x_1 - x_2. But then a | x_1 - x_2 and b | x_1 - x_2, so x_1 ≡ x_2 (mod a) and x_1 ≡ x_2 (mod b). This means x_1 and x_2 are solutions to the same system of congruences; that is, x_1 = x_2. Therefore, the solutions are distinct.

(b) If x is a solution to one of the systems of congruences, then x is relatively prime to a, and x is relatively prime to b, and so, by Euclid's lemma, Theorem 3.3, x is relatively prime to ab.

(c) If x is a positive integer less than and relatively prime to ab, then x is also relatively prime to both a and b. Therefore, by the division algorithm, x = q_1 a + r, for some integers q_1 and r where r is relatively prime to a and 0 < r < a, and also x = q_2 b + s, for some integers q_2 and s where s is relatively prime to b and 0 < s < b. Hence x is a solution to the system of congruences

x ≡ r (mod a) and x ≡ s (mod b),

as desired.


7.6 (a) 64 = 4 · 4 · √4 · √4 = 4 · 4 · 4 = √(√(√(4^(4!)))).

(b) 64 = φ(φ(4^4)).

7.8 For n = 36, we get 


φ(1) + φ(2) + φ(3) + φ(4) + φ(6) + φ(9) + φ(12) + φ(18) + φ(36)
= 1 + 1 + 2 + 2 + 2 + 6 + 4 + 6 + 12 = 36.


We will explain the general proof by continuing with the example for n = 36. For each divisor d we list the numbers from 1 to 36 whose gcd with 36 equals d, and also compute φ(36/d) for each d.

1: {1, 5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35}, φ(36/1) = φ(36) = 12

2: {2, 10, 14, 22, 26, 34}, φ(36/2) = φ(18) = 6

3: {3, 15, 21, 33}, φ(36/3) = φ(12) = 4

4: {4, 8, 16, 20, 28, 32}, φ(36/4) = φ(9) = 6

6: {6, 30}, φ(36/6) = φ(6) = 2

9: {9, 27}, φ(36/9) = φ(4) = 2

12: {12, 24}, φ(36/12) = φ(3) = 2

18: {18}, φ(36/18) = φ(2) = 1

36: {36}, φ(36/36) = φ(1) = 1

Now, there are two ideas in this proof we need to verify. The first is almost obvious: as we go down this list and let d range through the divisors of 36 from 1 to 36, we also have the same divisors of 36, in the opposite order, on the right-hand side within the expressions φ(36), φ(18), φ(12), φ(9), φ(6), and so on. In other words,

Σ_{d | n} φ(d) = Σ_{d | n} φ(n/d),

simply because as d ranges over all the divisors of n, so does n/d.

The second idea that needs verifying is why for a given divisor d there are exactly φ(36/d) numbers from 1 to 36 whose gcd with 36 equals d. If a is a number from 1 to 36 such that gcd(a, 36) = d, then d | a, and gcd(a/d, 36/d) = 1. So there are exactly φ(36/d) numbers whose gcd with 36 equals d, because that's how many numbers from 1 to 36/d are relatively prime to 36/d.

For example, for d = 4, φ(36/4) = φ(9) = 6, and the six numbers relatively prime to 9 are 1, 2, 4, 5, 7, and 8, which correspond exactly to the six numbers 4, 8, 16, 20, 28, and 32 from 1 to 36 whose gcd with 36 equals 4, since

4 = 1 · 4, 8 = 2 · 4, 16 = 4 · 4, 20 = 5 · 4, 28 = 7 · 4, and 32 = 8 · 4.
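The grouping used in this proof is easy to reproduce. The sketch below sorts the numbers 1 through n by their gcd with n and compares each class size with φ(n/d); `phi` is a deliberately naive counting implementation, and both function names are illustrative.

```python
from math import gcd

def phi(n):
    """Euler's function by direct counting (fine for small n)."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def classes_by_gcd(n):
    """Map each divisor d of n to the list of a in 1..n with gcd(a, n) = d."""
    return {d: [a for a in range(1, n + 1) if gcd(a, n) == d]
            for d in range(1, n + 1) if n % d == 0}

classes = classes_by_gcd(36)
for d, members in classes.items():
    print(d, members, phi(36 // d))
print(sum(phi(d) for d in classes))
```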

7.9 2 is a primitive root for p = 11. 2 is also a primitive root for 13, but 2 is 
not a primitive root for q = 17; however, 3 is a primitive root for 17. 

7.10 φ(10) = 4, and 3 is a primitive root of 10, since

3^1 = 3, 3^2 = 9, 3^3 ≡ 7, 3^4 ≡ 1 (mod 10),
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and so the powers of 3 generate all the positive numbers less than and 
relatively prime to 10. 

7.11 Since 2 is a primitive root for 13, the primitive roots of 13 will be the powers 2^r of 2 for which r is relatively prime to 12, that is,

2^1 = 2, 2^5 ≡ 6, 2^7 ≡ 11, 2^11 ≡ 7 (mod 13),

and so 2, 6, 7, and 11 are the primitive roots of 13.

7.12 The least primitive root for the prime 41 is 6. The orders of the numbers 1, 2, 3, 4, 5 modulo 41 are, respectively, 1, 20, 8, 10, 20.
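Orders and primitive roots for small moduli can be found by direct multiplication. The following sketch uses illustrative helper names (`order`, `primitive_roots`) and brute force only; it reproduces the orders modulo 41 listed here and the primitive roots of 13 from Problem 7.11.

```python
from math import gcd

def order(a, n):
    """Multiplicative order of a modulo n; a must be relatively prime to n."""
    if gcd(a, n) != 1:
        raise ValueError("a must be relatively prime to n")
    k, x = 1, a % n
    while x != 1:
        x = x * a % n
        k += 1
    return k

def primitive_roots(n):
    """All primitive roots of n in 1..n-1, found by brute force."""
    units = [a for a in range(1, n) if gcd(a, n) == 1]
    return [a for a in units if order(a, n) == len(units)]

print([order(a, 41) for a in range(1, 6)])
print(primitive_roots(41)[0], primitive_roots(13))
```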

7.13 It is clear that (bc)^(rs) ≡ 1 (mod p). Assume that for some t ≤ rs we also have (bc)^t ≡ 1 (mod p), where t is the order of bc. Then t | rs. Since r and s are relatively prime, we can write t = uv with u | r and v | s. Then, since c^s ≡ 1 (mod p),

b^(su) ≡ b^(su) c^(su) = (b^(uv) c^(uv))^(s/v) = ((bc)^t)^(s/v) ≡ 1 (mod p).

Thus r | su, but r and s are relatively prime, so r | u, and since we already have u | r, we conclude that u = r. Similarly, v = s, and so t = rs, and the order of bc is rs, as claimed.

7.14 First, write

x^(p-1) - 1 = (x^d - 1)(x^(p-1-d) + x^(p-1-2d) + ... + 1).

Now, we know that x^(p-1) - 1 ≡ 0 (mod p) has p - 1 solutions by Fermat's little theorem, but by Lagrange's theorem, Theorem 6.5, the congruence x^(p-1-d) + x^(p-1-2d) + ... + 1 ≡ 0 (mod p) can have at most p - 1 - d solutions. Therefore, the congruence x^d - 1 ≡ 0 (mod p) has at least d solutions, but again since it can have no more than d solutions it must have exactly d solutions. Thus x^d ≡ 1 (mod p) has exactly d solutions.

7.15 Since p - 1 = 12 = 2^2 · 3, we need to find a number n_1 of order 4, and a number n_2 of order 3. So n_1 = 5 or 8, and n_2 = 3 or 9. Hence the four primitive roots of 13 are

5 · 3 ≡ 2, 5 · 9 ≡ 6, 8 · 3 ≡ 11, and 8 · 9 ≡ 7 (mod 13).

7.16 Let a be a primitive root. Then a number a^r that appears in the list a, a^2, ..., a^φ(n) ≡ 1 will be a primitive root of n if and only if r is relatively prime to φ(n). Thus n has exactly φ(φ(n)) primitive roots.

For n = 9, φ(φ(9)) = φ(6) = 2. Since we know that 2 is a primitive root of 9, we look in the list 2, 2^2, 2^3, 2^4, 2^5, 2^6 for exponents that are relatively prime to 6. Therefore, 2^5 ≡ 5 (mod 9), and 5 is the other primitive root. It is also clear from this list that 4 and 7, being 2^2 and 2^4, respectively, will have order 3; and that 8, being 2^3, will have order 2; and that 3 and 6 can't be primitive roots because they are not relatively prime to 9. Thus 9 has exactly two primitive roots, 2 and 5.

7.17 We have just seen that the statement in the hint is true for k = 3. Assume the statement is true for some k ≥ 3. Then

a^(2^(k-1)) = (a^(2^(k-2)))^2 = (1 + r · 2^k)^2 = 1 + r · 2^(k+1) + r^2 · 2^(2k),

for some r by the induction hypothesis, so a^(2^(k-1)) ≡ 1 (mod 2^(k+1)). Therefore, for each odd integer a, a^(2^(k-2)) ≡ 1 (mod 2^k), and we see that the order of each odd integer must be less than φ(2^k) = 2^(k-1). Hence 2^k does not have a primitive root.

7.18 (a) For n = 15, since 14 = 15 - 1, 13 = 15 - 2, 11 = 15 - 4, and 8 = 15 - 7, it is sufficient to check that

1^4 ≡ 1, 2^4 = 16 ≡ 1, 4^4 = 16^2 ≡ 1^2 = 1, 7^4 = 49^2 ≡ 4^2 ≡ 1 (mod 15).

(b) In general, let a be a positive integer less than and relatively prime to mn, and let r = lcm(φ(m), φ(n)). Then a is also relatively prime to each of m and n, so a^φ(m) ≡ 1 (mod m) and a^φ(n) ≡ 1 (mod n). Hence a^r ≡ 1 (mod m) and a^r ≡ 1 (mod n), which means that a^r ≡ 1 (mod mn). We just need to show that r < φ(mn).

But recall from Problem 3.5 that

lcm(φ(m), φ(n)) · gcd(φ(m), φ(n)) = φ(m)φ(n),

and we know that both φ(m) and φ(n) are even, so gcd(φ(m), φ(n)) ≥ 2. Therefore,

lcm(φ(m), φ(n)) = φ(m)φ(n)/gcd(φ(m), φ(n)) ≤ φ(m)φ(n)/2 = φ(mn)/2 < φ(mn),

as we wished to show. Hence mn has no primitive roots.
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7.19 In view of Problems 7.17 and 7.18 we need only show that integers of 

the form p^α and 2p^α have primitive roots. We begin by showing that p^α has a primitive root. Let a be a primitive root of the prime p. The idea is to show that a is also a primitive root for p^α. We will use induction.

Before we can even start the induction, though, there is a potential problem we need to deal with if by chance it should have happened that a^(p-1) ≡ 1 (mod p^2). However, if this did happen by our choice of the primitive root a, we simply replace a by an equivalent primitive root a + p, and note that the same thing could not also have happened for a + p, since by the binomial theorem

(a + p)^(p-1) ≡ a^(p-1) + (p - 1)a^(p-2) p ≡ 1 + (p - 1)a^(p-2) p (mod p^2),

and so (a + p)^(p-1) ≢ 1 (mod p^2).

Thus we begin the induction by assuming that a is a primitive root of p, and

a^(p-1) ≢ 1 (mod p^2).

Our goal is to show that a is a primitive root of p^α; that is, that the order of a is φ(p^α) = p^α - p^(α-1) = p^(α-1)(p - 1).

Now, the order of a must divide φ(p^α), and if n is this order, then n | p^(α-1)(p - 1); but we also have a^n ≡ 1 (mod p), so p - 1 | n. This means that n is of the form p^s(p - 1), for some s ≤ α - 1. We will use induction to show that s = α - 1.

Thus, assume for some k ≥ 2 that

a^(p^(k-2)(p-1)) ≢ 1 (mod p^k).

Note that the base case of the induction was done above. However, by Euler's theorem, Theorem 7.1, a^(p^(k-2)(p-1)) ≡ 1 (mod p^(k-1)), so we can write a^(p^(k-2)(p-1)) = 1 + r p^(k-1), where p does not divide r, and use the binomial theorem to get

a^(p^(k-1)(p-1)) = (1 + r p^(k-1))^p ≡ 1 + r p^k ≢ 1 (mod p^(k+1)).

So a^(p^(α-2)(p-1)) ≢ 1 (mod p^α), and a is a primitive root of p^α.

Finally, we still need to show that 2p^α has a primitive root. Let a be a primitive root of p^α. Since φ(2p^α) = φ(2)φ(p^α) = φ(p^α), if a is odd, then a also has order φ(p^α) modulo 2p^α, and so a is a primitive root of 2p^α. If a is even, then a + p^α is odd, and since it also has order φ(p^α) modulo 2p^α, it is a primitive root of 2p^α.

We will make one additional comment about this proof. It can in fact happen that a^(p-1) ≡ 1 (mod p^2) for a prime p where a is a primitive root of p. The smallest instance of this is for the prime p = 40 487, and for the primitive root 5. It is true that 5^40486 ≡ 1 (mod 40 487^2).
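This exceptional case is cheap to confirm with modular exponentiation. The sketch below is a check of the stated fact only: it uses the factorization 40 486 = 2 · 31 · 653 (worked out here, not taken from the text) to confirm that 5 is a primitive root of p, and then shows that the replacement a + p from the proof repairs the problem.

```python
p = 40487
a = 5

# 5 is a primitive root of p: a^((p-1)/q) is not 1 for each prime q dividing p - 1.
prime_factors = (2, 31, 653)             # 2 * 31 * 653 == 40486
is_primitive_mod_p = all(pow(a, (p - 1) // q, p) != 1 for q in prime_factors)

# The troublesome congruence a^(p-1) = 1 (mod p^2) does hold for a = 5 ...
bad_case = pow(a, p - 1, p * p) == 1

# ... but not for the equivalent primitive root a + p.
repaired = pow(a + p, p - 1, p * p) != 1

print(is_primitive_mod_p, bad_case, repaired)
```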

7.22 (a) The quadratic residues are 1, 4, 9, 16, and 

5^2 ≡ 2, 6^2 ≡ 13, 7^2 ≡ 3, 8^2 ≡ 18, 9^2 ≡ 12, 10^2 ≡ 8, 11^2 ≡ 6 (mod 23),

that is, 1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18.

If we subtract each of these from 23, then we get the quadratic nonresidues: 5, 7, 10, 11, 14, 15, 17, 19, 20, 21, 22.
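The two lists can be regenerated by squaring every nonzero residue. This is a brute-force sketch with an illustrative function name; it also confirms the subtraction trick used to obtain the nonresidues for p = 23.

```python
def quadratic_residues(p):
    """Sorted nonzero quadratic residues modulo the prime p."""
    return sorted({x * x % p for x in range(1, p)})

p = 23
residues = quadratic_residues(p)
nonresidues = [a for a in range(1, p) if a not in residues]
print(residues)
print(nonresidues)
```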

7.23 x = 2, y = 8 is a solution to x^2 + y^2 ≡ -1 (mod 23).

7.24 (c) The property that holds is: p - a is a quadratic residue if and only if 
a is a quadratic residue. 

7.25 We can see by inspection that no two of the quadratic residues of 13 
sum to - 1 modulo 13, so the only solution to x 2 + y 2 = - 1 (mod 13) is 
when one of x or y is 0 modulo 13; that is, when there is a solution to 
x 2 = -1 (mod 13) or y 2 = -1 (mod 13). So x = ±5, y = 0 and 

x — 0, y = ±5 are the only solutions. 

This is very different than for a prime of the form p = An + 3, where 
a solution to the congruence x 2 + y 2 = -1 (mod p) must have both x 
and y being nonzero because of Theorem 6.4. 

7.26 Writing 


18 · 19 = 12^2 + 10^2 + 7^2 + 7^2

we can reduce the squares modulo 18 to get

18 · 11 = (-6)^2 + (-8)^2 + 7^2 + 7^2.

Then, using the fundamental identity of Euler, we get

18^2 · 11 · 19 = (12^2 + 10^2 + 7^2 + 7^2)((-6)^2 + (-8)^2 + 7^2 + 7^2)
= (-72 - 80 + 49 + 49)^2 + (-96 + 60 - 49 + 49)^2
+ (84 + 70 + 42 + 56)^2 + (84 - 70 - 56 + 42)^2
= (-54)^2 + (-36)^2 + 252^2 + 0^2.

And so, dividing through by 18^2,

11 · 19 = 3^2 + 2^2 + 14^2 + 0^2

is a smaller multiple of 19 written as a sum of four squares.

7.27 Writing 

5 · 19 = 3^2 + 9^2 + 2^2 + 1^2

we can reduce the squares modulo 5 to get

5 · 2 = (-2)^2 + (-1)^2 + 2^2 + 1^2.

Then, using the fundamental identity of Euler, we get

5^2 · 2 · 19 = (3^2 + 9^2 + 2^2 + 1^2)((-2)^2 + (-1)^2 + 2^2 + 1^2)
= (-6 - 9 + 4 + 1)^2 + (-3 + 18 - 2 + 2)^2
+ (6 + 9 + 4 + 1)^2 + (3 - 18 - 2 + 2)^2
= (-10)^2 + 15^2 + 20^2 + (-15)^2.

And so, dividing through by 5^2,

2 · 19 = 2^2 + 3^2 + 4^2 + 3^2

is a smaller multiple of 19 written as a sum of four squares.

We continue the descent by reducing once more modulo 2 to get

2 · 1 = 0^2 + 1^2 + 0^2 + 1^2.

Then, using the fundamental identity of Euler, we get

2^2 · 1 · 19 = (2^2 + 3^2 + 4^2 + 3^2)(0^2 + 1^2 + 0^2 + 1^2)
= (0 + 3 + 0 + 3)^2 + (2 - 0 - 4 + 0)^2
+ (0 + 3 - 0 - 3)^2 + (2 - 0 + 4 - 0)^2
= 6^2 + (-2)^2 + 0^2 + 6^2.

And so, dividing through by 2^2, we finally have

19 = 3^2 + 1^2 + 0^2 + 3^2


written as a sum of four squares. 
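Each step of the descent is mechanical enough to script. The sketch below is an illustration rather than a general-purpose routine: the function names are made up, the four terms of `euler_product` are arranged to match the sign pattern used in the computations above, and `centered` picks the residue of least absolute value (keeping +k/2 when k is even). No attempt is made to handle the edge cases treated in the full proof.

```python
def euler_product(u, v):
    """Euler's four-square identity: the four terms whose squares sum to |u|^2 |v|^2."""
    a, b, c, d = u
    A, B, C, D = v
    return (a * A + b * B + c * C + d * D,
            a * B - b * A - c * D + d * C,
            a * C + b * D - c * A - d * B,
            a * D - b * C + c * B - d * A)

def centered(n, k):
    """Residue of n modulo k of least absolute value."""
    r = n % k
    return r - k if 2 * r > k else r

def descent_step(u, k):
    """From squares summing to k*p, produce squares summing to a smaller multiple of p."""
    v = tuple(centered(t, k) for t in u)
    w = euler_product(u, v)
    if any(t % k for t in w):
        raise ValueError("terms are not all divisible by k")
    return tuple(abs(t) // k for t in w)

u = (3, 9, 2, 1)           # 5 * 19
u = descent_step(u, 5)     # squares now sum to 2 * 19
u = descent_step(u, 2)     # squares now sum to 19
print(u, sum(t * t for t in u))
```

The same function reproduces Problem 7.26 when it is started from (12, 10, 7, 7) with k = 18.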

7.28 Since a ≡ α, b ≡ β, c ≡ γ, d ≡ δ (mod k),

aα + bβ + cγ + dδ ≡ a^2 + b^2 + c^2 + d^2 = kp ≡ 0 (mod k),

and so k^2 divides the first square (aα + bβ + cγ + dδ)^2.

For the same reason, k^2 divides the other three squares because

aβ - bα - cδ + dγ ≡ ab - ba - cd + dc = 0 (mod k),

aγ + bδ - cα - dβ ≡ ac + bd - ca - db = 0 (mod k),

aδ - bγ + cβ - dα ≡ ad - bc + cb - da = 0 (mod k).

7.29 85 = 5 · 17, so all four of its divisors (1, 5, 17, 85) are of the form 4k + 1; therefore, there are 4(4 - 0) = 16 representations, two of which are essentially different: 85 = 2^2 + 9^2 = 6^2 + 7^2.

The smallest number written as a sum of two squares in three essentially different ways is 325 = 1^2 + 18^2 = 6^2 + 17^2 = 10^2 + 15^2.

7.30 Since 28 = 2^2 · 7, there are 24σ(7) = 24(1 + 7) = 192 representations. There are three essentially different ones:

28 = 25 + 1 + 1 + 1 = 16 + 4 + 4 + 4 = 9 + 9 + 9 + 1,

and each has 4 · 2^4 = 64 variations, since there are four positions for the "different" integer in each case, and there are 2^4 possible changes of signs.
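The counts in Problems 7.29 and 7.30 can be confirmed by exhaustive search, counting ordered representations with signs. This is a brute-force sketch with an illustrative function name; it is practical only for small n.

```python
from itertools import product
from math import isqrt

def count_representations(n, k):
    """Number of ordered k-tuples of integers (signs included) whose squares sum to n."""
    bound = isqrt(n)
    return sum(1 for t in product(range(-bound, bound + 1), repeat=k)
               if sum(x * x for x in t) == n)

print(count_representations(85, 2))    # two squares, Problem 7.29
print(count_representations(325, 2))   # three essentially different ways
print(count_representations(28, 4))    # four squares, Problem 7.30
```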

7.31 (a) The formula for σ(n) is given by

σ(n) = (1 + p_1 + p_1^2 + ... + p_1^(e_1))(1 + p_2 + p_2^2 + ... + p_2^(e_2)) ... (1 + p_k + p_k^2 + ... + p_k^(e_k)),

because each divisor of p_1^(e_1) p_2^(e_2) ... p_k^(e_k) will occur exactly once in this product. We can simplify this formula by writing

σ(n) = (p_1^(e_1 + 1) - 1)/(p_1 - 1) · (p_2^(e_2 + 1) - 1)/(p_2 - 1) ... (p_k^(e_k + 1) - 1)/(p_k - 1).

(d) Here the assumption is that 2^n - 1 is prime and we must show that N = 2^(n-1)(2^n - 1) is perfect; that is, we must show that σ(N) = 2N. But since 2^n - 1 is prime, σ(2^n - 1) = 1 + (2^n - 1) = 2^n, so

σ(N) = σ(2^(n-1)(2^n - 1)) = σ(2^(n-1))σ(2^n - 1) = (2^n - 1)2^n = 2N.


7.32 Suppose 4 | (a^2 + b^2 + c^2). Since any square is congruent to 0 or 1 modulo 4, it follows that a, b, and c must be even. Therefore, if a number m = a^2 + b^2 + c^2 is a sum of three squares, and 4 | m, then m/4 = (a/2)^2 + (b/2)^2 + (c/2)^2 is also a sum of three squares. Hence, since no number of the form 8k + 7 is a sum of three squares, no number of the form 4^n(8k + 7) can be a sum of three squares.

7.33 454 = 7^3 + 3^3 + 3^3 + 3^3 + 3^3 + 1^3 + 1^3 + 1^3.


7.34 The fourth powers are 1, 16, 81, ..., and so 15 requires 15 fourth powers, 31 requires 16, 47 requires 17, 63 requires 18, and 79 requires 19: 79 = 4 · 16 + 15 · 1. (Note that the next number in the sequence, 95, is greater than 81, and only needs 15 fourth powers.)

The fifth powers are 1, 32, 243, ..., and so 31 requires 31 fifth powers, 63 requires 32, 95 requires 33, and so on, until 223 requires 37: 223 = 6 · 32 + 31 · 1.

Similarly, 703 is the first number that requires 73 sixth powers, and 703 = 10 · 64 + 63 · 1.

In general, then, you simply want (⌊(3/2)^k⌋ - 1) terms of the form 2^k; one more such term would exceed 3^k. And you want (2^k - 1) terms of the form 1^k. Thus

n = (⌊(3/2)^k⌋ - 1) · 2^k + (2^k - 1) · 1^k

works perfectly, since

(⌊(3/2)^k⌋ - 1) + (2^k - 1) = ⌊(3/2)^k⌋ + 2^k - 2.
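The general formula and the three worked cases can be cross-checked by computing, with a small dynamic program, the least number of k-th powers actually needed. Both function names below are illustrative; `waring_witness` evaluates the formula for n, and `min_powers` is an independent brute-force count.

```python
def waring_witness(k):
    """Return (n, count) with n = (floor((3/2)^k) - 1) * 2^k + (2^k - 1)."""
    q = 3 ** k // 2 ** k                  # floor((3/2)^k) in exact integer arithmetic
    n = (q - 1) * 2 ** k + (2 ** k - 1)
    return n, (q - 1) + (2 ** k - 1)

def min_powers(n, k):
    """Least number of positive k-th powers summing to n (dynamic programming)."""
    powers = [b ** k for b in range(1, n + 1) if b ** k <= n]
    best = [0] * (n + 1)
    for m in range(1, n + 1):
        best[m] = 1 + min(best[m - q] for q in powers if q <= m)
    return best[n]

for k in (4, 5, 6):
    n, count = waring_witness(k)
    print(k, n, count, min_powers(n, k))
```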
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7.35 When p = 2, the “sum” is just 1 for all values of k, so assume p is an odd 
prime. 

If k is a multiple of p - 1, then by Fermat's little theorem each kth power a^k is congruent to 1 modulo p, so the sum will be congruent to p - 1.

For any other value of k, the sum is congruent to 0 modulo p. Let a be a primitive root for the prime p. Then the numbers 1, 2, 3, ..., p - 1 can be represented by powers a, a^2, a^3, ..., a^(p-1) of the primitive root in some order. Furthermore, we know that a^(p-1) ≡ 1 (mod p), so the sum of the kth powers modulo p is given by

1 + a^k + a^(2k) + a^(3k) + ... + a^((p-2)k) = (a^((p-1)k) - 1)/(a^k - 1).

Now, the numerator on the right is divisible by p by Fermat's little theorem, but also, since k is not a multiple of p - 1, we know that the denominator is not divisible by p; hence the sum is congruent to 0 modulo p as claimed.
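A direct computation confirms both cases for a few small odd primes. This is only a numerical check, with an illustrative function name; the expected value is p - 1 when (p - 1) | k and 0 otherwise.

```python
def power_sum_mod(p, k):
    """1^k + 2^k + ... + (p-1)^k reduced modulo p."""
    return sum(pow(a, k, p) for a in range(1, p)) % p

for p in (3, 5, 7, 11, 13):
    row = [power_sum_mod(p, k) for k in range(1, 13)]
    print(p, row)
```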

7.36 By Problem 5.32 we need only show that numbers of the form 4n + 2 can be written as a sum or difference of three squares, but we can write 4n + 2 as

4n + 2 = (2n + 1)^2 - (2n)^2 + 1^2.

Alternatively, we can write: for odd integers,

2n + 1 = (n + 1)^2 - n^2 + 0^2;

and for even integers,

2n = n^2 - (n - 1)^2 + 1^2.

7.37 We can write any number of the form 6k as a sum or difference of five 
cubes by adding 0 3 to the identity in the hint, and we can write any 
number of the form 6k ± 1 as a sum or difference of five cubes by 
adding ±1 3 to this identity. 

For numbers of the form 6k + 2, replace k by k - 1 in the identity in the hint to get

6(k - 1) = k^3 + (k - 2)^3 - 2 · (k - 1)^3;

adding 8 to both sides yields

6k + 2 = k^3 + (k - 2)^3 - 2 · (k - 1)^3 + 2^3.

For numbers of the form 6k - 2, replace k by k + 1 in the identity in the hint to get

6(k + 1) = (k + 2)^3 + k^3 - 2 · (k + 1)^3;

subtracting 8 from both sides yields

6k - 2 = (k + 2)^3 + k^3 - 2 · (k + 1)^3 - 2^3.

Finally, for numbers of the form 6k + 3, replace k by k - 4 to get

6(k - 4) = (k - 3)^3 + (k - 5)^3 - 2 · (k - 4)^3;

adding 27 to both sides yields

6k + 3 = (k - 3)^3 + (k - 5)^3 - 2 · (k - 4)^3 + 3^3.

An alternative proof makes use of the fact that 6 | (n^3 - n), since n^3 - n = n(n + 1)(n - 1). Therefore, n^3 - n = 6k, for some k. Hence we can write any integer n as

n = n^3 - 6k = n^3 + 2 · k^3 - (k + 1)^3 - (k - 1)^3,

a representation of n with five cubes.

Note, however, that while this second proof is quite slick, it is not nearly as good from a practical point of view. For example, if n = 15, the first argument lets k = 2, and produces 15 = (-1)^3 + (-3)^3 - 2(-2)^3 + 3^3, that is, 15 = -1 - 27 + 8 + 8 + 27; whereas the second argument lets k = 560, and produces 15 = 15^3 + 2 · 560^3 - 561^3 - 559^3 = 351 235 375 - 351 235 360.
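The first argument amounts to a small algorithm: pick the cube t^3 that matches n modulo 6, then apply the identity 6k = (k + 1)^3 + (k - 1)^3 - 2k^3 to what is left. The sketch below packages this with illustrative names; the table `SHIFT` simply records the choices t = 0, ±1, ±2, 3 made in the solution, and the second function implements the "slick" alternative for comparison.

```python
# Residue of n modulo 6 -> the integer t whose cube is split off (t^3 = n mod 6).
SHIFT = {0: 0, 1: 1, 5: -1, 2: 2, 4: -2, 3: 3}

def five_cubes(n):
    """Return five integers whose cubes sum to n (first argument of the solution)."""
    t = SHIFT[n % 6]
    k = (n - t ** 3) // 6                 # exact, because t^3 = n (mod 6)
    return (k + 1, k - 1, -k, -k, t)

def five_cubes_slick(n):
    """Return five integers whose cubes sum to n (second argument of the solution)."""
    k = (n ** 3 - n) // 6
    return (n, k, k, -(k + 1), -(k - 1))

print(five_cubes(15))
print(five_cubes_slick(15))
```

For n = 15 the first routine returns cubes of size at most 27, while the second needs cubes of 559, 560, and 561, which is the practical difference noted above.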

7.38 95 800^4 + 217 519^4 + 414 560^4

= 84 229 075 969 600 000 000

+ 2 238 663 363 846 304 960 321

+ 29 535 857 400 192 040 960 000

= 31 858 749 840 007 945 920 321 = 422 481^4.
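Since Python integers have arbitrary precision, the identity can be confirmed exactly in one line of arithmetic. The variable names are illustrative.

```python
terms = [95800 ** 4, 217519 ** 4, 414560 ** 4]
total = sum(terms)
print(terms)
print(total, total == 422481 ** 4)
```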

7.39 It is routine, yet somehow simultaneously tedious and rewarding, to verify these two identities using (a + b)^3 = a^3 + 3a^2 b + 3ab^2 + b^3 and (a + b)^5 = a^5 + 5a^4 b + 10a^3 b^2 + 10a^2 b^3 + 5ab^4 + b^5.
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That there are infinitely many positive values for the cubes in the 
first identity is guaranteed by choosing j > s/g. 

In the second identity it follows that if 25 < (x/y)^5 < 75, then each term will be positive. Therefore, by choosing x = 2y we can produce infinitely many solutions since 25 < 32 < 75.

The two smallest solutions correspond to y = 1, x = 2; and to y = 2, x = 4. The first value of y that produces more than one solution is y = 3, since we know x = 6 works, but x = 7 also works because (7/3)^5 < 75. Similarly, for y = 6, we see that x = 12, 13, 14 also work; whereas for y = 12, not only do x = 24, 25, 26, 27, 28 work, but now x = 23 also works because (23/12)^5 > 25.

7.40 ρ + ρ^2 = (-1 + i√3)/2 + (-1 - i√3)/2 = -1, so 1 + ρ + ρ^2 = 1 - 1 = 0.

When n = 4, let σ = i. Then σ^2 = -1, σ^3 = -i, and σ^4 = 1, so 1 + σ + σ^2 + σ^3 = 1 + i - 1 - i = 0.

7.42 Let σ = e^(iπ/3) = cos(π/3) + i sin(π/3) = (1 + i√3)/2. Squaring, we get σ^2 = (-1 + i√3)/2.

Note that since σ^2 = ρ, where ρ is the cube root of unity from Problems 7.40 and 7.41, we already know that σ^6 = 1.

Now, multiplying σ^2 by σ, we get σ^3 = -1. Thus σ^4 = -σ = (-1 - i√3)/2, and σ^5 = -σ^2 = (1 - i√3)/2. So, 1 + σ + σ^2 + σ^3 + σ^4 + σ^5 =

1 + (1 + i√3)/2 + (-1 + i√3)/2 - 1 + (-1 - i√3)/2 + (1 - i√3)/2 = 0.

Finally, the reason this sum is 0 is because, as we know from Problem 5.18, since σ ≠ 1, we can write

1 + σ + σ^2 + σ^3 + σ^4 + σ^5 = (σ^6 - 1)/(σ - 1),

which is 0 because σ^6 = 1.

7.43 Since m is positive, $\frac{1}{d} + \frac{1}{s} > \frac{1}{2}$, and since d and s must both be greater
than 2, the only solutions correspond to:
d = 3, s = 3: so m = 6, which is the tetrahedron; 
d = 3, s = 4: so m = 12, which is the cube; 
d = 3, s = 5: so m = 30, which is the dodecahedron; 
d = 4, s = 3: so m = 12, which is the octahedron; 
d = 5, s = 3: so m = 30, which is the icosahedron. 
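
The five cases above can be checked by brute force. The sketch below assumes the underlying relation is $\frac{1}{d} + \frac{1}{s} = \frac{1}{2} + \frac{1}{m}$ (the form Euler's formula takes when m counts edges); only the inequality appears in the answer above, so the exact relation and the function name `platonic_solutions` are assumptions made for illustration.

```python
from fractions import Fraction


def platonic_solutions(limit=20):
    """Return all (d, s, m) with 3 <= d, s < limit and 1/d + 1/s = 1/2 + 1/m.

    The relation is an assumption taken from Euler's formula; m must be a
    positive integer, so the excess over 1/2 has to be a unit fraction.
    """
    found = []
    for d in range(3, limit):
        for s in range(3, limit):
            excess = Fraction(1, d) + Fraction(1, s) - Fraction(1, 2)
            if excess > 0 and excess.numerator == 1:
                found.append((d, s, excess.denominator))
    return found


solutions = platonic_solutions()
```

Raising `limit` does not produce further solutions, since the excess is already nonpositive once both d and s are at least 4.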
Chapter 8

8.1 For a positive integer n, the expression

$$n = \frac{a(a + 1)}{2} + \frac{b(b + 1)}{2} + \frac{c(c + 1)}{2},$$

representing n as a sum of three or fewer triangular numbers (not all of
a, b, and c are 0), can be rewritten, by multiplying by 8 and adding 3 to
both sides, as

$$8n + 3 = (2a + 1)^2 + (2b + 1)^2 + (2c + 1)^2,$$

which represents the number 8n + 3 as a sum of three odd squares. The
equivalence of the two statements now follows.

8.2 There are twelve quadratic residues of the prime 43 from 1 to 21: 

1, 4, 6, 9, 10, 11, 13, 14, 15, 16, 17, 21; therefore, there are nine 
quadratic nonresidues: 2, 3, 5, 7, 8, 12, 18, 19, 20. Hence there are 
three representations of 5 as a sum of three triangular numbers: 

5 = 3 + 1 + 1 = 1 + 3 + 1 = 1 + 1 + 3.
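
The list of residues in 8.2 can be reproduced by squaring every nonzero class modulo 43. This is only a small sketch; the function name `quadratic_residues` and its `upper` parameter are chosen here for illustration.

```python
def quadratic_residues(p, upper=None):
    """Return the sorted quadratic residues r of p with 1 <= r <= upper."""
    upper = p - 1 if upper is None else upper
    squares = {(x * x) % p for x in range(1, p)}
    return sorted(r for r in squares if 1 <= r <= upper)


residues = quadratic_residues(43, 21)
nonresidues = [r for r in range(1, 22) if r not in residues]
```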

8.3 There are forty-six primes less than 200. 

8.4/8.5 The comparisons are listed in the following table. Note that, as Gauss
believed, the logarithmic integral does give a somewhat better
approximation than x/ln x, and both ratios do seem to approach 1 as x
gets large.

        x     π(x)   x/ln x   π(x) - x/ln x   π(x)/(x/ln x)    Li(x)   Li(x) - π(x)   π(x)/Li(x)

      200       46    37.75        8.25            1.22         49.15       3.15          .936
      500       95    80.46       14.54            1.18         100.8       5.8           .942
     1000      168    144.7       23.3             1.16         176.6       8.6           .951
     5000      669    587         82               1.14         683.2      14.2           .979
   10 000     1229    1086       143               1.13         1245       16             .987
  100 000     9592    8686       906               1.10         9629       37             .996
1 000 000   78 498  72 382      6116               1.08       78 627      129             .998
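
The entries of the table can be recomputed with a short script. This sketch assumes that Li(x) means the integral of 1/ln t from 2 to x (which is consistent with the tabulated value 49.15 at x = 200) and approximates it with Simpson's rule; the function names are illustrative only.

```python
import math


def prime_count(x):
    """Count the primes <= x with a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(x) + 1):
        if sieve[i]:
            # Cross out the multiples of i starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return sum(sieve)


def li_from_2(x, steps=20000):
    """Approximate the integral of 1/ln(t) on [2, x] by Simpson's rule (steps even)."""
    h = (x - 2) / steps
    total = 1 / math.log(2) + 1 / math.log(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) / math.log(2 + k * h)
    return total * h / 3


row = (1000, prime_count(1000), 1000 / math.log(1000), li_from_2(1000))
```

For the larger values of x in the table, `steps` would need to be increased to keep the Simpson approximation accurate near t = 2.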
8.6 Since both Li(x) and x/ln x approach infinity as x approaches infinity, we
can apply l'Hôpital's rule. So, we take the derivative of each of these
functions, using the fundamental theorem of calculus for Li(x), and
the quotient rule for x/ln x, and get

$$\lim_{x \to \infty} \frac{\mathrm{Li}(x)}{x/\ln x} = \lim_{x \to \infty} \frac{\dfrac{1}{\ln x}}{\dfrac{\ln x - 1}{\ln^2 x}} = \lim_{x \to \infty} \frac{\ln x}{\ln x - 1} = 1.$$



8.7 Fermat’s condition for an odd prime p to be of the form $p = x^2 + 3y^2$ is
that $p \equiv 1 \pmod 3$, and examples are 7, 13, 19, 31, 37, 43.

For primes of the form $p = x^2 + 2y^2$, the condition is that $p \equiv 1$ or
$p \equiv 3 \pmod 8$, and examples are 3, 11, 17, 19, 41, 43.


8.9 Assume -n is a quadratic residue of p; then $a^2 \equiv -n \pmod p$, for some
a, so $a^2 + n \equiv 0 \pmod p$, and $p \mid a^2 + n \cdot 1^2$.

Conversely, assume $p \mid x^2 + ny^2$, for two relatively prime integers x
and y. Now, p and y are relatively prime (otherwise, since $p \nmid n$, p
would divide both x and y), so y has an inverse z such that $yz \equiv 1 \pmod p$,
by Theorem 6.1. Hence

$$-n \equiv -n(yz)^2 \equiv z^2(-ny^2) \equiv z^2(x^2) \equiv (zx)^2 \pmod p,$$

and -n is a quadratic residue of p.


8.11 In Theorem 6.4, the issue is whether -1 is a quadratic residue of the
prime p. For p of the form p = 4n + 1, by Euler’s criterion, Theorem 8.1,
we get

$$\left(\frac{-1}{p}\right) \equiv (-1)^{\frac{p-1}{2}} = (-1)^{2n} = 1 \pmod p,$$

and so -1 is a quadratic residue.

For p of the form p = 4n + 3,

$$\left(\frac{-1}{p}\right) \equiv (-1)^{\frac{p-1}{2}} = (-1)^{2n+1} = -1 \pmod p,$$

and -1 is a quadratic nonresidue. This proves Theorem 6.4.


8.12 Since we can write

$$\left(\frac{-a}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{a}{p}\right)$$

and, since -1 is a quadratic residue when p is of the form 4n + 1 and a
quadratic nonresidue when p is of the form 4n + 3, results (a) and (b)
follow immediately.
8.13 For a = 6, we get

6, -7, -1, 5, -8, -2, 4, -9, -3,

so n = 6, and $\left(\frac{6}{19}\right) = (-1)^n = (-1)^6 = 1$. Therefore, by Gauss’s lemma, 6
is a quadratic residue of 19.

For a = 3, we get

3, 6, 9, -7, -4, -1, 2, 5, 8,

so n = 3, and $\left(\frac{3}{19}\right) = (-1)^n = (-1)^3 = -1$. Therefore, by Gauss’s lemma,
3 is a quadratic nonresidue of 19.
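
The two lists in 8.13 can be generated mechanically. The sketch below reduces each multiple ka (for k from 1 to (p - 1)/2) to the residue of least absolute value and counts the negative ones; the function name `gauss_lemma` is a label chosen here.

```python
def gauss_lemma(a, p):
    """Return the reduced multiples a, 2a, ..., ((p-1)/2)a and the sign (-1)^n."""
    half = (p - 1) // 2
    reduced = []
    for k in range(1, half + 1):
        r = (a * k) % p
        # Shift residues above p/2 into the negative range.
        reduced.append(r - p if r > half else r)
    negatives = sum(1 for r in reduced if r < 0)
    return reduced, (-1) ** negatives


list_for_6, sign_for_6 = gauss_lemma(6, 19)
list_for_3, sign_for_3 = gauss_lemma(3, 19)
```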

8.15 We saw that for p = 13, of the six numbers 

3, 6, 9, 12, 15, 18, 

the first two numbers, 3 and 6, fall between 0 and p/2, and are positive;
the numbers 9 and 12 fall between p/2 and p, and are negative; and the
last two numbers fall between p and 3p/2, and are positive.

Similarly, for p = 7, the list is

3, 6, 9,

and 3 falls between 0 and p/2, and is positive; 6 falls between p/2 and p,
and is negative; and 9 falls between p and 3p/2, and is positive.

Therefore, the pattern is that for any prime of the form 12k + 1 or
12k + 7, there will be an equal number of numbers from the list in each
of the three intervals. Moreover, for primes of the form 12k + 1, there
will be an even number of numbers in the interval between p/2 and p,
and so 3 is a quadratic residue. For primes of the form 12k + 7, there
will be an odd number of numbers in the interval between p/2 and p,
and so 3 is a quadratic nonresidue.

For primes of the form 12k + 11 or 12k + 5, the pattern is that there
will be one more number in each of the last two intervals than in the
interval between 0 and p/2. For example, let p = 11; the list is

3, 6, 9, 12, 15,

and 6 and 9 are negative, while the rest are positive. Similarly, for
p = 17, the list is

3, 6, 9, 12, 15, 18, 21, 24,

and 9, 12, and 15 are negative, while the rest are positive.

Therefore, for primes of the form 12k + 11, there will be an even
number of numbers in the interval between p/2 and p, and so 3 is a
quadratic residue. For primes of the form 12k + 5, there will be an odd
number of numbers in the interval between p/2 and p, and so 3 is a
quadratic nonresidue. This proves Theorem 8.4.

8.16 The pattern is that there will always be the same number in each
interval except the first, which has one fewer. Since there will always be
the same number of numbers in the two intervals [p/2, p] and [3p/2, 2p],
then there will be an even number of negative numbers. So, by Gauss’s
lemma, 5 is a quadratic residue for any prime with remainder 19
modulo 20.

8.17 First, observe that the intervals [-13, -13/2] and [-26, -39/2] also
contain an odd number of multiples of 5. We get the interval [7/2, 7] by
adding 20 to -13 and 10 to -13/2; we get the interval [21/2, 14] by adding
40 to -26 and 30 to -39/2. Since we always add multiples of 10, these
intervals still contain an odd number of multiples of 5.

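8.18 Since 1984 = 64 · 31, for each of the three primes p we have
$\left(\frac{1984}{p}\right) = \left(\frac{31}{p}\right)$. So

for p = 3, $\left(\frac{1984}{3}\right) = \left(\frac{31}{3}\right) = \left(\frac{1}{3}\right) = 1$;

for p = 23, $\left(\frac{1984}{23}\right) = \left(\frac{31}{23}\right) = \left(\frac{8}{23}\right) = \left(\frac{4}{23}\right)\left(\frac{2}{23}\right) = \left(\frac{2}{23}\right) = 1$, by
Theorem 8.3;

for p = 29, $\left(\frac{1984}{29}\right) = \left(\frac{31}{29}\right) = \left(\frac{2}{29}\right) = -1$, by Theorem 8.3.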

Therefore, since 1984 is not a quadratic residue modulo 29, it is not a 
quadratic residue modulo 2001. 
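
The three symbols can be double-checked numerically with Euler's criterion (Theorem 8.1), which is a different route from the reduction used above. A minimal sketch, with `legendre` as an illustrative name:

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    value = pow(a % p, (p - 1) // 2, p)
    return -1 if value == p - 1 else value


symbols = {p: legendre(1984, p) for p in (3, 23, 29)}
```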

8.19 Substituting y = 2ax + b and $d = b^2 - 4ac$ into $y^2 \equiv d \pmod p$ and
dividing by 4a yields $ax^2 + bx + c \equiv 0 \pmod p$. The congruence
$2ax + b \equiv y \pmod p$ has a unique solution by Theorem 6.1, since 2a
and p are relatively prime.

Solving the specific congruence, we get $b^2 - 4ac = 49 - 36 = 13$, and
$6^2 \equiv 13$, $17^2 \equiv 13 \pmod{23}$; so we solve

$$6x - 7 \equiv 6 \pmod{23} \quad\text{and}\quad 6x - 7 \equiv 17 \pmod{23},$$

which become

$$6x \equiv 13 \pmod{23} \quad\text{and}\quad 6x \equiv 1 \pmod{23},$$

and we get x = 6 and x = 4 for the solutions.

Checking both of these in the original congruence, we get

$$3(6)^2 - 7(6) + 3 = 108 - 42 + 3 = 69 \equiv 0 \pmod{23},$$

and

$$3(4)^2 - 7(4) + 3 = 48 - 28 + 3 = 23 \equiv 0 \pmod{23}.$$
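
The procedure in 8.19 translates directly into code: find the square roots y of the discriminant modulo p, then solve 2ax + b ≡ y for each root. The sketch below finds the roots by exhaustive search (fine for small p) and uses Python 3.8+ modular inverses; the function name is an illustrative choice.

```python
def solve_quadratic_congruence(a, b, c, p):
    """Solve a*x^2 + b*x + c ≡ 0 (mod p) for an odd prime p not dividing a."""
    d = (b * b - 4 * a * c) % p
    # Square roots of the discriminant, found by trying every residue.
    roots = [y for y in range(p) if (y * y) % p == d]
    inverse_2a = pow(2 * a, -1, p)
    return sorted({((y - b) * inverse_2a) % p for y in roots})


solutions = solve_quadratic_congruence(3, -7, 3, 23)
```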


8-20 ( a ) (I) = (%r) = (i) = 1 , by Theorem 8.3. 

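(b) For $\left(\frac{19}{163}\right)$ we first compute $\left(\frac{163}{19}\right) = \left(\frac{11}{19}\right) = 1$, which is true since
$7^2 \equiv 11 \pmod{19}$; so, by the law of quadratic reciprocity,

$$\left(\frac{19}{163}\right)\left(\frac{163}{19}\right) = (-1)^{9 \cdot 81} = -1, \quad\text{and so}\quad \left(\frac{19}{163}\right) = -1.$$

(c) For $\left(\frac{29}{163}\right)$ we first compute $\left(\frac{163}{29}\right) = \left(\frac{18}{29}\right) = \left(\frac{9}{29}\right)\left(\frac{2}{29}\right) = \left(\frac{2}{29}\right) = -1$; thus
$\left(\frac{29}{163}\right)\left(\frac{163}{29}\right) = (-1)^{14 \cdot 81} = 1$, by the law of quadratic reciprocity, so we get

$$\left(\frac{29}{163}\right) = -1.$$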


8.21 (a) There are 54 lattice points inside the small rectangle, as we expect
since $\frac{19 - 1}{2} \cdot \frac{13 - 1}{2} = 54$.

(b) The numbers of lattice points below the diagonal in the even
columns 10, 12, 14, 16, 18 are 6, 8, 9, 10, 12; and the numbers of lattice
points below the diagonal in the odd columns 9, 7, 5, 3, 1 are
6, 4, 3, 2, 0. The parities match perfectly: even, even, odd, even, even.

(c) As i runs from 2 to 18, the iq terms are

26, 52, 78, 104, 130, 156, 182, 208, 234;

hence the residues $r_i$ become

7, 14, 2, 9, 16, 4, 11, 18, 6,

and changing the sign for the odd residues (that is, writing $(-1)^{r_i} r_i$ for
each term) produces

12, 14, 2, 10, 16, 4, 8, 18, 6.
(d) By Eisenstein’s lemma,

$$\left(\frac{13}{19}\right) = (-1)^{\lfloor 26/19 \rfloor + \lfloor 52/19 \rfloor + \lfloor 78/19 \rfloor + \lfloor 104/19 \rfloor + \lfloor 130/19 \rfloor + \lfloor 156/19 \rfloor + \lfloor 182/19 \rfloor + \lfloor 208/19 \rfloor + \lfloor 234/19 \rfloor}$$
$$= (-1)^{1 + 2 + 4 + 5 + 6 + 8 + 9 + 10 + 12} = (-1)^{57} = -1.$$

The exponent, 57, is of course also the number of lattice points in the
even columns below the diagonal.

(e) There are fifty-seven lattice points in the even-numbered rows to
the left of the diagonal, and twenty-seven lattice points in all the rows
to the left of the diagonal in the small rectangle. Either way, we get

$$\left(\frac{19}{13}\right) = (-1)^{57} = (-1)^{27} = -1.$$
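
The exponents in parts (d) and (e) are sums of floors, so they are easy to recompute. The sketch below uses p = 19 and q = 13, the values implied by the iq terms listed in part (c); the function names are illustrative.

```python
def eisenstein_exponent(q, p):
    """Sum of floor(i*q/p) over the even i with 2 <= i <= p - 1."""
    return sum((i * q) // p for i in range(2, p, 2))


def lattice_points_small_rectangle(q, p):
    """Sum of floor(j*q/p) over 1 <= j <= (p - 1)/2."""
    return sum((j * q) // p for j in range(1, (p - 1) // 2 + 1))


exponent_13_over_19 = eisenstein_exponent(13, 19)
exponent_19_over_13 = eisenstein_exponent(19, 13)
rows_19_over_13 = lattice_points_small_rectangle(19, 13)
```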

8.22 For p = 13, we have k = 3, so we compute $\frac{1}{2}\binom{6}{3} = 10$ and so a = -3.
Then 6!(-3) = -2160, which is congruent to 11 modulo 13, and so
b = -2. Therefore, we write $13 = 3^2 + 2^2$.

For p = 89, we have k = 22, so we compute $\frac{1}{2}\binom{44}{22}$ =
1 052 049 481 860, which is congruent to 5 modulo 89, and so a = 5.
Then 44!(5) =
13291357873942243840218129055073079451598192640000000000 ≡
81 (mod 89), and so b = -8. Therefore, we write $89 = 5^2 + 8^2$.
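
The computation in 8.22 can be sketched as follows, reading the recipe off the two worked cases: with p = 4k + 1, take a ≡ ½·C(2k, k) and b ≡ (2k)!·a modulo p, each reduced to the residue of least absolute value. The helper names `centered` and `gauss_two_squares` are hypothetical.

```python
import math


def centered(value, p):
    """Reduce value modulo p to the representative with absolute value < p/2."""
    r = value % p
    return r - p if r > p // 2 else r


def gauss_two_squares(p):
    """Return (a, b) with a^2 + b^2 = p for a prime p = 4k + 1."""
    k = (p - 1) // 4
    a = centered(math.comb(2 * k, k) // 2, p)
    b = centered(math.factorial(2 * k) * a, p)
    return a, b


pair_13 = gauss_two_squares(13)
pair_89 = gauss_two_squares(89)
```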


Chapter 9 

9.2 We begin with 

$55^2 - 2921 = 2^3 \cdot 13$,
$56^2 - 2921 = 5 \cdot 43$,
$57^2 - 2921 = 2^3 \cdot 41$,
$58^2 - 2921 = 443$,
$59^2 - 2921 = 2^4 \cdot 5 \cdot 7$,
$60^2 - 2921 = 7 \cdot 97$,
$61^2 - 2921 = 2^5 \cdot 5^2$,
$62^2 - 2921 = 13 \cdot 71$,
$63^2 - 2921 = 2^3 \cdot 131$,
$64^2 - 2921 = 5^2 \cdot 47$,
$65^2 - 2921 = 2^3 \cdot 163$,
$66^2 - 2921 = 5 \cdot 7 \cdot 41$.


At this stage, of the twelve lines, five lines have been eliminated by 
the sieve of mesh size 50, and among the remaining seven lines, there 
are four lines where the prime decomposition involves only the four 
primes 2, 5, 7, 41 and the collective exponent on each prime is even: 

$57^2 - 2921 = 2^3 \cdot 41$,
$59^2 - 2921 = 2^4 \cdot 5 \cdot 7$,
$61^2 - 2921 = 2^5 \cdot 5^2$,
$66^2 - 2921 = 5 \cdot 7 \cdot 41$.


This looks perfect! But, in this case, we notice that 
$$57 \cdot 59 \cdot 61 \cdot 66 \equiv 603 \pmod{2921}$$

and also, amazingly,

$$2^6 \cdot 5^2 \cdot 7 \cdot 41 \equiv 603 \pmod{2921}.$$

This means it will do us no good at all to write

$$(57 \cdot 59 \cdot 61 \cdot 66 + 2^6 \cdot 5^2 \cdot 7 \cdot 41)(57 \cdot 59 \cdot 61 \cdot 66 - 2^6 \cdot 5^2 \cdot 7 \cdot 41) \equiv 0 \pmod{2921},$$

because the right-hand factor is congruent to 0.

However, this piece of extremely bad luck is immediately followed 
by equally good luck, because 


$$67^2 - 2921 = 1568 = 2^5 \cdot 7^2$$

can be combined with

$$61^2 - 2921 = 800 = 2^5 \cdot 5^2$$

to produce

$$61^2 \cdot 67^2 \equiv 2^{10} \cdot 5^2 \cdot 7^2 \pmod{2921}.$$

Now, we write this as

$$(61 \cdot 67 + 2^5 \cdot 5 \cdot 7)(61 \cdot 67 - 2^5 \cdot 5 \cdot 7) \equiv 0 \pmod{2921},$$

which reduces, modulo 2921, to

$$2286 \cdot 46 \equiv 0 \pmod{2921}.$$

At this point we could use the Euclidean algorithm to compute either 
gcd(2286, 2921) or gcd(46, 2921), but we don’t need to, since 
46 = 2 • 23, and so we can immediately try 23 as a factor of 2921, and 
discover that 2921 = 23 • 127. Of course, we could also notice that 
127|2286. 
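
Both the unlucky and the lucky attempt in 9.2 can be replayed with a gcd computation. The sketch below takes a pair x, y with x² ≡ y² (mod n) and returns gcd(x - y, n); a result of 1 or n means the pair gives no factor. The function name is chosen here for illustration.

```python
import math


def split_from_congruent_squares(n, x, y):
    """Given x^2 ≡ y^2 (mod n), return gcd(x - y, n)."""
    if (x * x - y * y) % n != 0:
        raise ValueError("x^2 and y^2 are not congruent modulo n")
    return math.gcd(x - y, n)


n = 2921
unlucky = split_from_congruent_squares(n, 57 * 59 * 61 * 66, 2**6 * 5**2 * 7 * 41)
lucky = split_from_congruent_squares(n, 61 * 67, 2**5 * 5 * 7)
```

Here `unlucky` comes out as n itself, matching the observation that the right-hand factor is congruent to 0, while `lucky` yields a proper factor.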

9.3 We use fast exponentiation to show that $2^{9997} \not\equiv 2 \pmod{9997}$, from
which it follows by Fermat’s little theorem that 9997 is composite.
Specifically, $2^{9997} \equiv 8192 \pmod{9997}$. (And 9997 = 13 · 769.)
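
One common form of fast exponentiation is right-to-left square-and-multiply, sketched below; it may differ in detail from the hand method used in the text. Python's built-in three-argument `pow` performs the same job.

```python
def fast_power(base, exponent, modulus):
    """Compute base**exponent mod modulus by repeated squaring."""
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:
            # The current binary digit of the exponent is 1: multiply it in.
            result = (result * base) % modulus
        base = (base * base) % modulus
        exponent >>= 1
    return result


residue = fast_power(2, 9997, 9997)
```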

9.4 For Method 1, a reasonable average is about 7 seconds per division. In
Problem 8.4, we saw that 1000/ln(1000) ≈ 145, so there are roughly
145 primes up to $\sqrt{1\,000\,283}$ that might need to be checked, which
would take about 20 minutes. (Note that 1 000 283 = 941 · 1063, so
you have to go almost all the way up to $\sqrt{n}$ before you find a factor.)

For Method 2, beginning with $2^{15} = 32\,768$, a reasonable average for
computing the next four congruences is about 20 seconds per
congruence. Twelve more congruences are needed to reach the number
$2^{1\,000\,283}$, so the total time by this method would be slightly more than
5 minutes.

9.6 (a) Since p is prime, $p \mid (2^p - 2)$, and so we can write

$$2^{2^p - 2} - 1 = (2^p - 1)(2^{2^p - 2 - p} + 2^{2^p - 2 - 2p} + 2^{2^p - 2 - 3p} + \cdots + 1),$$

and $(2^p - 1) \mid (2^{2^p - 2} - 1)$. Therefore, $2^p - 1$ also divides
$2(2^{2^p - 2} - 1) = 2^{2^p - 1} - 2$; that is, $2^p - 1$ is a pseudoprime.
(b) First, we show that $2^n - 1$ is composite. Since n is composite, we can
write n = ab for two integers a, b > 1. Therefore, we get

$$2^n - 1 = (2^a - 1)(2^{n - a} + 2^{n - 2a} + 2^{n - 3a} + \cdots + 1),$$

and $2^n - 1$ is composite.

Next, we show that $2^n - 1$ is a pseudoprime, that is, that $2^n - 1$
divides $2^{2^n - 1} - 2$. Since $2^n - 1$ is odd, it will suffice to show that $2^n - 1$
divides $2^{2^n - 2} - 1$. But, since $n \mid (2^n - 2)$, we can write

$$2^{2^n - 2} - 1 = (2^n - 1)(2^{2^n - 2 - n} + 2^{2^n - 2 - 2n} + 2^{2^n - 2 - 3n} + \cdots + 1),$$

and $(2^n - 1) \mid (2^{2^n - 2} - 1)$, showing that $2^n - 1$ is a pseudoprime.

Hence, since 341 is a pseudoprime, we have an infinite sequence of
pseudoprimes:

$$341,\quad 2^{341} - 1,\quad 2^{2^{341} - 1} - 1,\quad 2^{2^{2^{341} - 1} - 1} - 1,\quad \ldots.$$

(c) First, we note that $\frac{2^p + 1}{3}$ is an integer since p is odd, and so we can
write

$$2^p + 1 = (2 + 1)(2^{p-1} - 2^{p-2} + 2^{p-3} - \cdots + 1).$$

Next we observe that $2^{2p} \equiv 1 \pmod n$ since
$3n = (2^p + 1)(2^p - 1) = 2^{2p} - 1$.

We claim that $2p \mid n - 1$. To verify this claim we write

$$3(n - 1) = 3n - 3 = 2^{2p} - 2^2 = (2^p - 2)(2^p + 2) = 2(2^{p-1} - 1)(2^p + 2).$$

Therefore, since $p \mid (2^{p-1} - 1)$, we conclude that $2p \mid n - 1$.

Now we write n - 1 = 2pk, for some integer k, and since $2^{2p} \equiv 1 \pmod n$, we get

$$2^{n-1} = 2^{2pk} = (2^{2p})^k \equiv 1^k = 1 \pmod n.$$

Hence n is a pseudoprime.
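
The argument in part (c) can be spot-checked numerically. The sketch below takes n from the relation $3n = 2^{2p} - 1$ used above and tests a few primes p > 3 (the restriction to p > 3 and the function names are assumptions made for this illustration).

```python
def pseudoprime_candidate(p):
    """Return n = (2^(2p) - 1) / 3, the number studied in part (c)."""
    return (2 ** (2 * p) - 1) // 3


def passes_base_2_test(n):
    """Check the congruence 2^(n-1) ≡ 1 (mod n)."""
    return pow(2, n - 1, n) == 1


checks = [
    (p, pseudoprime_candidate(p), passes_base_2_test(pseudoprime_candidate(p)))
    for p in (5, 7, 11, 13)
]
```

For p = 5 this reproduces the pseudoprime 341 = 11 · 31.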

9.10 (c) 41 041 = 7 · 11 · 13 · 41.

(d) 825 265 = 5 · 7 · 17 · 19 · 73.

(e) For k = 6, we get 37 · 73 · 109 = 294 409. Note that for k = 5, even
though 18k + 1 = 91 is not itself prime, the resulting number
31 · 61 · (7 · 13) is an absolute pseudoprime.

The following numbers are also absolute pseudoprimes:

63 973 = (6 + 1)(12 + 1)(18 + 1)(36 + 1),

126 217 = (6 + 1)(12 + 1)(18 + 1)(72 + 1),

997 633 = (6 + 1)(12 + 1)(18 + 1)(576 + 1).
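
These examples can be verified with Korselt's criterion: a composite n is an absolute pseudoprime exactly when it is squarefree and p - 1 divides n - 1 for every prime p dividing n. Korselt's criterion is a standard result assumed here rather than taken from the answer above, and the function names are illustrative.

```python
def prime_factors(n):
    """Return the prime factors of n with multiplicity, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def is_absolute_pseudoprime(n):
    """Korselt's criterion: composite, squarefree, and (p - 1) | (n - 1) for each prime p | n."""
    factors = prime_factors(n)
    squarefree = len(set(factors)) == len(factors)
    return len(factors) > 1 and squarefree and all((n - 1) % (p - 1) == 0 for p in factors)


examples = [41041, 825265, 294409, 31 * 61 * 91, 63973, 126217, 997633]
results = [is_absolute_pseudoprime(n) for n in examples]
```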

9.11 (a) In this case, $2^{280} \equiv 1 \pmod{561}$, but $2^{140} \equiv 67 \pmod{561}$, so we
conclude that 561 is not prime.

(b) In this case, $50^{35} \equiv 560 \equiv -1 \pmod{561}$. So, $50^{70} \equiv 1 \pmod{561}$,
and of course then it follows that $50^{140} \equiv 50^{280} \equiv 1 \pmod{561}$ as well.
Since the condition of the test is satisfied for this value a = 50, if this
were the only value we tested and it was a randomly chosen value, we
would be inclined to conclude that 561 is prime with a less than 25%
chance that it is composite.

(c)

$$\left(\tfrac{1}{4}\right)^{10} \approx .000\,000\,954 < \frac{1}{1\,000\,000}, \qquad \left(\tfrac{1}{4}\right)^{20} \approx .000\,000\,000\,000\,909 < \frac{1}{1\,000\,000\,000\,000}.$$
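
The powers quoted in parts (a) and (b) come from repeatedly squaring $a^{35}$, since 560 = 35 · 16. The sketch below lists the whole chain of exponents 35, 70, 140, 280, 560; the function name `power_chain` is a label chosen here, and the code only tabulates the powers rather than implementing the full test from the text.

```python
def power_chain(a, n):
    """Return [(e, a^e mod n)] for e = d, 2d, 4d, ..., n - 1, where d is the odd part of n - 1."""
    d = n - 1
    while d % 2 == 0:
        d //= 2
    chain = []
    e = d
    while e <= n - 1:
        chain.append((e, pow(a, e, n)))
        e *= 2
    return chain


chain_2 = dict(power_chain(2, 561))
chain_50 = dict(power_chain(50, 561))
```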


9.12 We prove that $3^k \mid 2^{3^k} + 1$ for all k ≥ 1. Clearly, this is true for k = 1, since
$3 \mid 2^3 + 1$.

Now, assume that $3^k \mid 2^{3^k} + 1$ for some k ≥ 1. We must prove that
$3^{k+1} \mid 2^{3^{k+1}} + 1$. Note that since $3^k \mid 2^{3^k} + 1$, it is also the case that $3 \mid 2^{3^k} + 1$;
hence

$$2^{3^k} \equiv -1 \pmod 3.$$

Now we write

$$2^{3^{k+1}} + 1 = (2^{3^k})^3 + 1 = (2^{3^k} + 1)\left((2^{3^k})^2 - 2^{3^k} + 1\right).$$

So $3 \mid (2^{3^k})^2 - 2^{3^k} + 1$, since $2^{3^k} \equiv -1 \pmod 3$, and $3^k \mid 2^{3^k} + 1$ by the
induction assumption; therefore, $3^{k+1} \mid 2^{3^{k+1}} + 1$, as desired.

9.14 If an even number is perfect, then its binary representation will be,
reading left to right, p 1s, followed by p - 1 0s, for some prime p. For
example, the binary representation of the perfect number
$496 = 2^4(2^5 - 1)$ is 111110000.
9.15 The first conjecture is true; the second is false. The sequence continues,
for Mersenne primes $2^{13} - 1$ and $2^{17} - 1$:

6, 28, 496, 8128, 33 550 336, 8 589 869 056;

hence the 6s and 8s do not alternate.

The final digit is determined by the prime p modulo 4. For p = 2, the
final digit is of course 6. If p has the form 4k + 1, then since $2^4 = 16$, it
is clear that $2^{4k}$ ends in 6, and also that $2^{4k+1}$ ends in 2; hence $2^{4k+1} - 1$
ends in 1. We conclude that $2^{p-1}(2^p - 1) = 2^{4k}(2^{4k+1} - 1)$ ends in 6.

If p has the form 4k + 3, then, again since $2^4 = 16$, we can see that
$2^{4k+2}$ ends in 4, and also that $2^{4k+3}$ ends in 8; hence $2^{4k+3} - 1$ ends in 7.
We conclude that $2^{p-1}(2^p - 1) = 2^{4k+2}(2^{4k+3} - 1)$ ends in 8.

In fact, it can be shown that, in this last case, the last two digits are
always 28.

9.16 Let

$$n = p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}$$

be a prime factorization of n where the $p_i$ are distinct odd primes and
all exponents are assumed to be positive. Since n is perfect, we have
$\sigma(n) = 2n$, and so we get

$$\sigma(p_1^{e_1})\,\sigma(p_2^{e_2}) \cdots \sigma(p_r^{e_r}) = 2n.$$

Therefore, since n is odd, exactly one of the $\sigma(p_i^{e_i})$ is even (but not
divisible by 4) and the rest are odd. Assume, without loss of generality,
that the one that is even is $\sigma(p_1^{e_1})$.

Now, for i > 1, $\sigma(p_i^{e_i})$ is odd, and

$$\sigma(p_i^{e_i}) = 1 + p_i + p_i^2 + p_i^3 + \cdots + p_i^{e_i},$$

and so $e_i$ must be even. Thus $p_2^{e_2} p_3^{e_3} \cdots p_r^{e_r}$ is a square, and we can
write

$$w^2 = p_2^{e_2} p_3^{e_3} \cdots p_r^{e_r}.$$

Similarly, since $\sigma(p_1^{e_1})$ is even, we conclude that $e_1$ must be odd. If
$p_1 \equiv 3 \equiv -1 \pmod 4$, then

$$\sigma(p_1^{e_1}) = 1 + p_1 + p_1^2 + \cdots + p_1^{e_1} \equiv 1 - 1 + \cdots + 1 - 1 = 0 \pmod 4.$$

But this is impossible since 4 does not divide $\sigma(p_1^{e_1})$. Hence $p_1 \equiv 1 \pmod 4$.

Finally, since $\sigma(p_1^{e_1}) \equiv 2 \pmod 4$ and $p_1 \equiv 1 \pmod 4$, it follows
from

$$\sigma(p_1^{e_1}) = 1 + p_1 + p_1^2 + \cdots + p_1^{e_1} \equiv 1 + 1 + 1 + \cdots + 1 \equiv 2 \pmod 4$$

that $e_1 \equiv 1 \pmod 4$. Therefore, we can write $n = p_1^{e_1} w^2$, exactly as
claimed.

9.18 (a) $\sigma(n) = \sigma(3 \cdot 2^k) = \sigma(3)\sigma(2^k) = 4(2^{k+1} - 1) = 8(2^k) - 4 = 2(3 \cdot 2^k) + (2 \cdot 2^k - 4)$,
and this is greater than $2n = 2(3 \cdot 2^k)$ since k ≥ 2.

(b) $\sigma(n) = \sigma(p^k) = \frac{p^{k+1} - 1}{p - 1}$, and we need to show that this is less than
$2n = 2p^k$. But

$$(p - 1)(2p^k) = 2p^{k+1} - 2p^k = p^{k+1} + (p^{k+1} - 2p^k) > p^{k+1} - 1,$$

since p ≥ 2; therefore, $\sigma(n) < 2n$.

(c) $\sigma(945) = \sigma(3^3 \cdot 5 \cdot 7) = \sigma(3^3)\sigma(5)\sigma(7) = (1 + 3 + 3^2 + 3^3)(1 + 5)(1 + 7) = 40 \cdot 6 \cdot 8 = 1920$
and 1920 > 2 · 945, so 945 is abundant.

(d) We will prove that the sequence

$$\frac{\sigma(1)}{1},\ \frac{\sigma(2)}{2},\ \frac{\sigma(3)}{3},\ \ldots$$

is unbounded; that is, we will prove that for any number r there is a
value n such that $\frac{\sigma(n)}{n} > r$. (For example, this means if r = 1 000 000,
then we can find an integer n such that $\frac{\sigma(n)}{n} > 1\,000\,000$.)

The key step will be to let n = k! for a sufficiently large value of k. So

$$\frac{\sigma(n)}{n} = \sum_{d \mid k!} \frac{1}{d} > \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{k},$$

because the divisors of k! certainly include 1, 2, 3, ..., k (as well as
other numbers since k ≥ 3).

But it is well known from calculus that the harmonic series

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots$$

diverges; that is, it grows without bound. Therefore, given a number r,
there is a value k such that

$$\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{k} > r,$$

and so, for n = k!, we have $\frac{\sigma(n)}{n} > r$.

Now, since the sequence $\frac{\sigma(1)}{1}, \frac{\sigma(2)}{2}, \frac{\sigma(3)}{3}, \ldots$ is unbounded, this means
there are infinitely many superabundant numbers because we can
never reach a largest value of $\frac{\sigma(n)}{n}$.
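
The computations in 9.18 are easy to replay with a divisor-sum function. The sketch below sums divisors in pairs up to the square root; the function name `sigma` mirrors the notation of the text, and the k = 6 comparison with the harmonic sum is an example chosen here.

```python
from fractions import Fraction
import math


def sigma(n):
    """Sum of all positive divisors of n."""
    total, d = 0, 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total


# Part (d) with k = 6: sigma(k!)/k! exceeds the harmonic sum 1 + 1/2 + ... + 1/6.
k = 6
ratio = Fraction(sigma(math.factorial(k)), math.factorial(k))
harmonic = sum(Fraction(1, j) for j in range(1, k + 1))
```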

9.19 The only perfect number for which this is true is 6. First, we rule out the 
possibility that n could be an odd perfect number with this property 
(even though it is extremely unlikely that any odd perfect numbers 
exist). 

So, assume n is an odd perfect number and that $\sigma(\sigma(n))$ is perfect.
Now, $\sigma(\sigma(n)) = \sigma(2n) = \sigma(2)\sigma(n) = 3 \cdot 2n = 6n$. Thus, by Theorem 9.1,

$$6n = 2^{p-1}(2^p - 1),$$

where p is prime and $2^p - 1$ is prime. But n is odd, so p = 2, and we
conclude that n = 1, which is not a perfect number after all. This
contradiction shows there are no odd perfect numbers with the
property that $\sigma(\sigma(n))$ is perfect.

Now assume n is an even perfect number and that $\sigma(\sigma(n))$ is perfect.
By Theorem 9.1, we can write

$$n = 2^{p-1}(2^p - 1),$$

where p is prime and $2^p - 1$ is prime. Note, then, that
$\sigma(2^p - 1) = 1 + (2^p - 1) = 2^p$.

Therefore,

$$\sigma(\sigma(n)) = \sigma(2n) = \sigma(2^p(2^p - 1)) = \sigma(2^p)\,\sigma(2^p - 1) = (2^{p+1} - 1)2^p.$$

But if this number is perfect, then, by Theorem 9.1, p + 1 is prime.
Hence p cannot be an odd prime, and so p = 2.

Thus $n = 2^{2-1}(2^2 - 1) = 2 \cdot 3 = 6$.

9.20 (a) First, we write q = 2kp + 1. Then, since $2^p \equiv 1 \pmod q$,

$$2^{\frac{q-1}{2}} = 2^{kp} = (2^p)^k \equiv 1^k = 1 \pmod q.$$

Thus, by Euler’s criterion, Theorem 8.1, $\left(\frac{2}{q}\right) = 1$, and so, by Theorem
8.3, $q \equiv \pm 1 \pmod 8$.

(b) Since p = 19, any prime factor of $2^{19} - 1$ will be of the form 38k + 1.
Thus, among the potential divisors in the list 39, 77, 115, 153, ..., 723
(note that $\lfloor\sqrt{524\,287}\rfloor = 724$), only 191, 229, 419, 457, 571, 647 are
prime, and among these numbers only 191, 457, 647 need to be
checked since they are the only ones congruent to ±1 modulo 8.

(c) For p = 43, the potential divisors are 87, 173, 259, 345, 431, ...,
and the first prime 173 is congruent to 5 modulo 8, so it doesn’t need to
be checked. The next prime 431 is congruent to 7 modulo 8, so it needs
to be checked and, in fact, it divides $2^{43} - 1$:

8 796 093 022 207 = 431 · 20 408 568 497.

Amazingly, the very first number we tried actually worked! 
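
The search described in parts (b) and (c) can be sketched as a filter over the numbers 2kp + 1: keep those that are prime and congruent to ±1 modulo 8, and stop at the square root of $2^p - 1$. The function names below are illustrative, and primality is tested by plain trial division.

```python
import math


def is_prime(n):
    """Trial-division primality test, adequate for small candidates."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def mersenne_candidates(p):
    """Yield primes q = 2kp + 1 with q ≡ ±1 (mod 8) and q <= sqrt(2^p - 1)."""
    limit = math.isqrt(2**p - 1)
    q = 2 * p + 1
    while q <= limit:
        if q % 8 in (1, 7) and is_prime(q):
            yield q
        q += 2 * p


def first_factor(p):
    """Return the smallest candidate dividing 2^p - 1, or None if there is none."""
    for q in mersenne_candidates(p):
        if pow(2, p, q) == 1:
            return q
    return None


candidates_19 = list(mersenne_candidates(19))
```

A result of `None` from `first_factor` means no candidate below the square root divides $2^p - 1$, so that Mersenne number is prime.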

9.21 By the formula for the Fermat numbers from Problem 5.29,

$$F_0 F_1 F_2 \cdots F_{n-1} = F_n - 2.$$

Thus we have

$$2^{2^n} - 1 = F_n - 2 = F_0 F_1 F_2 \cdots F_{n-1}.$$

But these n Fermat numbers are pairwise relatively prime (see Problems
5.27 and 5.29), so each has a prime divisor that does not appear in the
prime factorization of any of the other Fermat numbers; hence $2^{2^n} - 1$
has at least n distinct prime divisors.

9.23 This is a proof by contradiction. Suppose p is the largest prime. Let q be
a prime divisor of $2^p - 1$. Then $2^p \equiv 1 \pmod q$, and so the order of 2
modulo q is p. Thus the p numbers $2, 2^2, 2^3, \ldots, 2^p$ are all distinct
modulo q and none of them are congruent to 0; hence p ≤ q - 1 and q
is a prime larger than p. This contradiction means there are infinitely
many primes.

9.26 (a) Since $2^{59} - 1$ = 576 460 752 303 423 487 =
179 951 · 3 203 431 780 337 is the prime decomposition of $2^{59} - 1$, you
have to search through all the numbers of the form 118k + 1 for
potential prime divisors until you reach the smallest prime divisor
179 951. The prime number theorem tells us that about 8% of the
numbers in this range will be prime, but our sequence has only odd
numbers, so about 16% would be prime, and we can expect to be able
to ignore half of these since they will have the wrong remainder
modulo 8. Therefore, we still need to check (that is, try as potential
divisors) about 122 primes.

(b) The nice thing about the Lucas-Lehmer test is that the number of
steps depends on p in such a clear way because you just compute the
sequence, and each step is the same, until you get to $s_{p-2}$ after p - 2
steps. So, in this case, it takes fifty-seven steps.


Chapter 10 

10.4 (a) $2C_2\,\mathrm{Li}_2(1000) = 45.8$.

(c) $2C_2\,\mathrm{Li}_2(1\,000\,000\,000) = 3\,425\,299$.

10.7 Four numbers of the form p, p + 2, p + 6, and p + 8 are a prime 
quadruplet if all four numbers are prime. 

10.12 The integer n = 468 is the first integer for which this inequality fails; of 
course, any larger integer will do just as well. We have already verified 
Bertrand’s postulate up to n = 5000 in Problem 10.11. 

10.13 Since 2 and 3 are prime, this statement is true for n = 2 and n = 3. So 
consider n > 3. 

If n is even, write n = 2k, where k > 1. By Bertrand’s postulate, there
is a prime p such that k < p < 2k. Thus 2k < 2p, and, as desired,
p < 2k = n < 2p.

If n is odd, write n = 2k + 1, where k > 1. By Bertrand’s postulate,
there is a prime p such that k < p < 2k. Thus 2k < 2p, but this means
that the odd number 2k + 1 is also less than the even number 2p. Hence
p < 2k + 1 = n < 2p.

10.15 By Bertrand’s postulate there is a prime p such that $\frac{n}{2} < p \le n$. This
prime p occurs exactly once in the prime factorization of n!; therefore,
n! is not a square.

10.17 First, assume that Goldbach’s conjecture is true. Let n > 5 be an integer.
We consider two cases. If n is even, then n - 2 is also even, so we can use
Goldbach’s conjecture to write n - 2 = p + q as a sum of two primes p
and q; but then n = p + q + 2 is a sum of three primes. Next, if n is odd,
then n - 3 is even, so we can use Goldbach’s conjecture to write
n - 3 = p + q as a sum of two primes p and q; but then n = p + q + 3 is a
sum of three primes. Thus the revised version of the conjecture in
Goldbach’s letter is true.

Now assume that the revised version of the conjecture in Goldbach’s
letter is true. Let n > 2 be an even integer. Then n + 2 > 5, so we can
write n + 2 = p + q + r as a sum of three primes p, q, and r. But n + 2 is
even, so not all three of p, q, and r can be odd; hence at least one of
these primes is 2. Without loss of generality, assume r = 2. Thus n + 2 = p + q + 2,
and n = p + q is a sum of two primes. Therefore, Goldbach’s conjecture
is true, and these two statements are equivalent.


Chapter 11 


11.2 (c) $8^{11} \equiv 66 \pmod{67}$ and $15^{11} \equiv 1 \pmod{67}$, so 8 and 15 work, as
Gauss discovered.

11.3 (a) 
11.6 (b) A Germain prime p > 3 must be of the form 6k + 5.

11.8 (a) For q = 41, when we compute the residues modulo 41 of each of the
fifth powers $1^5, 2^5, 3^5, \ldots, 40^5$, we get the following residues:

1, 3, 9, 14, 27, 32, 38, 40.

Since none of these are consecutive, 41 must divide one of x, y, or z.
Note that it is not necessary in this computation to go past the term
$20^5$ because at this point each residue will simply be the negative of a
previous residue; for example, since $20^5 \equiv 32 \pmod{41}$, we know that
$21^5 \equiv (-20)^5 \equiv -32 \equiv 9 \pmod{41}$; hence there is no need to compute
$21^5, \ldots, 40^5$ once we observe that we already have inverse pairs in our
list of residues, and we do: 1 and 40, 3 and 38, 9 and 32, and 14 and 27.
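
The list of fifth-power residues, and the fact that no two of them are consecutive, can be confirmed directly. A minimal sketch, with function names chosen for illustration:

```python
def power_residues(exponent, q):
    """Return the sorted set of x^exponent mod q for 1 <= x <= q - 1."""
    return sorted({pow(x, exponent, q) for x in range(1, q)})


def has_consecutive(values):
    """True if some two neighbouring entries of the sorted list differ by 1."""
    return any(b - a == 1 for a, b in zip(values, values[1:]))


fifth_powers = power_residues(5, 41)
```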


Chapter 12 

12.2 We can find all possible n-mora meters by adding an H to the
beginning of each (n - 2)-mora meter and an L to the beginning of
each (n - 1)-mora meter. For n = 5 we thus use the sets

{HL, LLL, LH} and {HLL, HH, LHL, LLLL, LLH}

to produce all possible five-mora meters:

{HHL, HLLL, HLH, LHLL, LHH, LLHL, LLLLL, LLLH}.
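
The recursive construction in 12.2 can be written out directly. The sketch assumes, as the construction implies, that H counts for two morae and L for one; the function name `meters` is a label chosen here.

```python
def meters(n):
    """Return all meters of total length n morae, with H = 2 morae and L = 1 mora."""
    if n == 0:
        return [""]
    if n == 1:
        return ["L"]
    # Prefix H to every (n - 2)-mora meter and L to every (n - 1)-mora meter.
    return ["H" + m for m in meters(n - 2)] + ["L" + m for m in meters(n - 1)]


five_mora = meters(5)
```

The counts `len(meters(n))` for n = 1, 2, 3, 4, 5 are 1, 2, 3, 5, 8.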

12.3 (a) The sequence to the left of 0 looks almost exactly like it does to the 
right of 0: 

..., -21, 13, -8, 5, -3, 2, -1, 1, 0, 1, 1, 2, 3, 5, 8, 13, 21, ....

12.4 (a) Since $\frac{a}{b}$ is less than 1, the number $\frac{a}{b} - \frac{1}{n}$ is also less than 1. The
number $\frac{a}{b} - \frac{1}{n}$ can be written as the fraction $\frac{an - b}{bn}$. Note that since $\frac{1}{n} \le \frac{a}{b}$,
we know that $\frac{a}{b} - \frac{1}{n}$ is nonnegative; hence the fraction $\frac{an - b}{bn}$ is also
nonnegative.

Since $\frac{a}{b} < \frac{1}{n - 1}$ it follows that an - a < b, that is, an - b < a, and the
numerator of this fraction is less than a as desired. Repeating this
process produces a sequence of fractions with smaller and smaller
nonnegative numerators. The process stops when the numerator,
an - b, equals 0, in which case, the fraction $\frac{a}{b}$ equals $\frac{1}{n}$, a unit fraction.

We still need to show why the unit fractions produced by this
algorithm are distinct. To do this we show that the denominators of
the unit fractions get larger at each stage. First, recall a basic property
about fractions: if u and v are positive integers such that u < v, then
$\frac{u}{v} < \frac{u + 1}{v + 1}$ (for example, $\frac{2}{3} < \frac{3}{4}$).

So, since $\frac{a}{b} < \frac{1}{n - 1}$, we can use this basic property of fractions to
conclude that $\frac{a}{b} < \frac{2}{n}$. Thus, an < 2b, which we rewrite as an - b < b.
Therefore, $\frac{an - b}{bn} < \frac{1}{n}$. But this means that at this stage an integer larger
than n is required for the denominator of the next unit fraction.
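
The algorithm analysed in 12.4 can be sketched with exact rational arithmetic. The code assumes that at each stage n is chosen as the smallest integer with 1/n ≤ a/b (the choice the inequalities above describe); the function name and the sample fraction 4/13 are illustrative.

```python
from fractions import Fraction


def greedy_unit_fractions(fraction):
    """Return the denominators of the unit fractions found by the greedy method."""
    denominators = []
    while fraction > 0:
        # Smallest n with 1/n <= a/b, i.e. the ceiling of b/a.
        n = -(-fraction.denominator // fraction.numerator)
        denominators.append(n)
        fraction -= Fraction(1, n)
    return denominators


example = greedy_unit_fractions(Fraction(4, 13))
```

For 4/13 the successive remainders are 3/52 and 1/468, so the numerators decrease 4, 3, 1 as the argument predicts.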

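12.16 We begin with the case where n = 2k is even. So, recalling that $\alpha\beta = -1$
and observing that $\alpha^2 + \beta^2 = \alpha + 1 + \beta + 1 = 3$, we get

$$F_{2k+1}F_{2k-1} - 1 = \frac{1}{5}(\alpha^{2k+1} - \beta^{2k+1})(\alpha^{2k-1} - \beta^{2k-1}) - 1$$
$$= \frac{1}{5}\left(\alpha^{4k} - \alpha^2(\alpha\beta)^{2k-1} - \beta^2(\alpha\beta)^{2k-1} + \beta^{4k} - 5\right)$$
$$= \frac{1}{5}\left(\alpha^{4k} + (\alpha^2 + \beta^2) - 5 + \beta^{4k}\right) = \frac{1}{5}\left(\alpha^{4k} - 2 + \beta^{4k}\right)$$
$$= \frac{1}{5}\left(\alpha^{4k} - 2(\alpha\beta)^{2k} + \beta^{4k}\right) = \left(\frac{\alpha^{2k} - \beta^{2k}}{\sqrt{5}}\right)^2 = F_{2k}^2.$$

The case where n = 2k + 1 is odd is almost exactly the same; the only
difference is that near the end a term $(\alpha\beta)^{2k+1}$ needs to be inserted
instead of the term $(\alpha\beta)^{2k}$ we inserted above.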
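
A quick numerical check complements the algebra. The sketch assumes the identity being proved is $F_{n+1}F_{n-1} - F_n^2 = (-1)^n$ (the even case above is this identity with n = 2k), with $F_0 = 0$ and $F_1 = 1$.

```python
def fib(n):
    """Return the Fibonacci number F_n, with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


differences = [fib(n + 1) * fib(n - 1) - fib(n) ** 2 for n in range(1, 30)]
```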

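12.19 (a) For n = 1, we see that $F_3 - 1 = 2 - 1 = 1 = F_1$. Assume the identity
is true for some n ≥ 1. Then

$$F_1 + F_2 + F_3 + \cdots + F_n + F_{n+1} = (F_{n+2} - 1) + F_{n+1} = F_{n+3} - 1,$$

so the identity is true for all n.

(b) By Binet’s formula

$$F_1 + F_2 + \cdots + F_n = \frac{1}{\sqrt{5}}\left((\alpha + \alpha^2 + \cdots + \alpha^n) - (\beta + \beta^2 + \cdots + \beta^n)\right)$$
$$= \frac{1}{\sqrt{5}}\left(\alpha(1 + \alpha + \alpha^2 + \cdots + \alpha^{n-1}) - \beta(1 + \beta + \beta^2 + \cdots + \beta^{n-1})\right)$$
$$= \frac{1}{\sqrt{5}}\left(\frac{\alpha^2(\alpha^n - 1)}{\alpha(\alpha - 1)} - \frac{\beta^2(\beta^n - 1)}{\beta(\beta - 1)}\right)$$
$$= \frac{1}{\sqrt{5}}\left(\alpha^{n+2} - \beta^{n+2} - (\alpha^2 - \beta^2)\right) = F_{n+2} - 1,$$

since $\alpha(\alpha - 1) = 1$ and $\beta(\beta - 1) = 1$, and
$\alpha^2 - \beta^2 = (\alpha + \beta)(\alpha - \beta) = \alpha - \beta = \sqrt{5}$.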

12.20 Because of the recursive property of the Fibonacci sequence, we can
begin with two adjacent 1 × 1 squares, attach a 2 × 2 square to form a
2 × 3 rectangle, then attach a 3 × 3 square to form a 3 × 5 rectangle,
and so on. In this way, at any stage, we have produced an $F_n \times F_{n+1}$
rectangle built of n squares of size $F_1^2, F_2^2, \ldots, F_n^2$.

12.23 (a) By Theorem 12.3, $F_{2n}$ represents the number of ways to tile a
1 × (2n - 1) board. We need to show that the left side of this identity
also counts this same number of tilings. Since 2n - 1 is odd, any tiling
of a 1 × (2n - 1) board must use at least one square, and a total of at
least n pieces.

We group all of the tilings into n distinct groups by looking at the
first n pieces in each tiling, and for each number i, where 1 ≤ i ≤ n, we
group together those tilings that have exactly i squares among these
first n pieces. We claim that for each i there are exactly $\binom{n}{i} F_i$ tilings in
the group whose tilings have i squares among the first n pieces.

If there are i squares among the first n pieces, then there are n - i
dominoes among these pieces, and these n pieces cover a total of
i + 2(n - i) = 2n - i cells. Since there are $\binom{n}{i}$ different orders in which
we can choose to place these i squares among the first n pieces, there
are $\binom{n}{i}$ ways to cover the first 2n - i cells. Since the remaining i - 1 cells
can be covered in any way whatsoever, there are $f_{i-1} = F_i$ ways to do so.
Thus there are a total of $\binom{n}{i} F_i$ tilings in the group whose tilings have i
squares among the first n pieces. This verifies the claim and completes
the proof of the identity.
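
The counting argument can be checked numerically. The sketch assumes the identity in question is $\sum_{i=1}^{n} \binom{n}{i} F_i = F_{2n}$, which is what the grouping above counts; the problem statement itself is not reproduced in this answer, so that reading is an assumption.

```python
import math


def fib(n):
    """Return the Fibonacci number F_n, with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def grouped_count(n):
    """Sum of C(n, i) * F_i over 1 <= i <= n, one term per group of tilings."""
    return sum(math.comb(n, i) * fib(i) for i in range(1, n + 1))


comparison = [(grouped_count(n), fib(2 * n)) for n in range(1, 15)]
```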

12.28 The following n integers are consecutive:

(n + 2)! + 3, (n + 2)! + 4, (n + 2)! + 5, ..., (n + 2)! + n + 2,

and are divisible, respectively, by 3, 4, 5, ..., n + 2. Therefore, by
Theorem 12.6, the n consecutive Fibonacci numbers

$$F_{(n+2)!+3},\ F_{(n+2)!+4},\ F_{(n+2)!+5},\ \ldots,\ F_{(n+2)!+n+2}$$

are divisible, respectively, by $F_3, F_4, F_5, \ldots, F_{n+2}$.

12.29 (b) We can express a positive integer n as a sum of distinct
nonconsecutive Fibonacci numbers in the following way. If n is a
Fibonacci number, we are done; otherwise, let F_{k_1} be the largest
Fibonacci number less than n. Thus F_{k_1} < n < F_{k_1+1}. Next let
n_1 = n - F_{k_1}, and we repeat the process using n_1 in place of n. Note that

n_1 = n - F_{k_1} < F_{k_1+1} - F_{k_1} = F_{k_1-1},

so this process indeed produces distinct nonconsecutive Fibonacci
numbers. Also, the process terminates when some n_i is a Fibonacci
number (perhaps at 1 = F_2 if not before) since the numbers
n, n_1, n_2, ... form a decreasing sequence of positive integers.

Now we need to show that this is the only way to represent a positive
integer as a sum of distinct nonconsecutive Fibonacci numbers; that is,
this representation is unique. Suppose, by way of contradiction, that n
is the smallest positive integer that has a representation other than the
one produced above. Let n = F_{k_1} + F_{k_2} + ... + F_{k_r} be the representation
given above for n, and let n = F_{j_1} + F_{j_2} + ... + F_{j_s} be another
representation where we assume that F_{j_1} > F_{j_2} > ... > F_{j_s}.

Because of the way the representation n = F_{k_1} + F_{k_2} + ... + F_{k_r} was
produced, we are guaranteed that F_{j_1} ≤ F_{k_1}. We claim that in fact
F_{j_1} < F_{k_1}. This is clear if n is itself a Fibonacci number since in that case
n = F_{k_1}, and it is impossible to also have n = F_{j_1} = F_{k_1} (since this would
make both representations the same). On the other hand, if n is not a
Fibonacci number, we still cannot have F_{j_1} = F_{k_1}, because then the
number n - F_{k_1} would have two different representations (namely,
F_{k_2} + ... + F_{k_r} and F_{j_2} + ... + F_{j_s}), contradicting the assumption that n
is the smallest number with two different representations; therefore, in
this case, we also have F_{j_1} < F_{k_1}. This verifies the claim. Now, we
consider two cases.

If k_1 is odd, we write k_1 = 2m + 1. Note then that j_1 ≤ 2m, which in
turn means that j_2 ≤ 2m - 2, and so on. Thus, by the identity in
Problem 12.22, we get

n = F_{j_1} + F_{j_2} + ... + F_{j_s}
  ≤ F_{2m} + F_{2m-2} + ... + F_2 = F_{2m+1} - 1 = F_{k_1} - 1 < n,

a contradiction.

If k_1 is even, we write k_1 = 2m + 2. Note then that j_1 ≤ 2m + 1, which
in turn means that j_2 ≤ 2m - 1, and so on. By the identity in Problem
12.21, we get

n = F_{j_1} + F_{j_2} + ... + F_{j_s}
  < F_{2m+1} + F_{2m-1} + ... + F_1 = F_{2m+2} = F_{k_1} ≤ n,

again a contradiction. Therefore, the representation for n is unique.
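The greedy procedure used in the existence half of Problem 12.29(b) can be sketched in a few lines of Python. This is only an illustration: the function name zeckendorf is hypothetical, and the code indexes the Fibonacci numbers from F_2 = 1, as the solution does.

```python
def zeckendorf(n):
    """Greedy representation of n >= 1 as a sum of Fibonacci numbers.

    Repeatedly subtracts the largest Fibonacci number not exceeding what
    is left, which is the procedure described in the solution.
    """
    fibs = [1, 2]  # F_2, F_3, ... (F_1 is deliberately left out)
    while fibs[-1] + fibs[-2] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    parts = []
    for f in reversed(fibs):
        if f <= n:
            parts.append(f)
            n -= f
    return parts


print(zeckendorf(100))
```

Running it on a range of inputs is a quick way to see that the summands produced are never consecutive Fibonacci numbers.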

12.31 The sequence is a Fibonacci sequence: 2, 3, 5, 8, ... . Let s_n be the
number of ways to seat men and women in n chairs with this
condition. Clearly, s_1 = 2 and s_2 = 3. The sequence will therefore be a
Fibonacci sequence 2, 3, 5, 8, ... , provided it satisfies the recurrence
relation s_n = s_{n-1} + s_{n-2}. So, suppose n > 2 and consider all s_n
arrangements of men and women in n chairs satisfying the condition.
How many of these arrangements have a man sitting in the first
chair? Since the remaining n - 1 chairs can have any arrangement of
men and women in them (still satisfying the condition, of course)
there are s_{n-1} of these arrangements. How many of these arrangements
have a woman sitting in the first chair? In each of these arrangements
there must be a man sitting in the second chair, but then since the
remaining n - 2 chairs can have any arrangement of men and women
in them (still satisfying the condition) there are s_{n-2} of these
arrangements. Therefore, s_n = s_{n-1} + s_{n-2}, and the sequence is a
Fibonacci sequence.
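The recurrence in Problem 12.31 can be checked by brute force. The sketch below assumes the seating condition is that no two women sit in adjacent chairs; the solution above implies this (a woman in the first chair forces a man into the second), but the problem statement itself is not reproduced here. The function name count_seatings is hypothetical.

```python
from itertools import product


def count_seatings(n):
    """Count rows of n chairs, each holding 'M' or 'W', with no 'WW' pair.

    Assumption: the condition in the problem is "no two women adjacent".
    """
    return sum(1 for row in product("MW", repeat=n)
               if "WW" not in "".join(row))


counts = [count_seatings(n) for n in range(1, 9)]
print(counts)
```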
Note that while matrix multiplication is not in general a
commutative operation, it is in this case. For example, we could also
have computed M^3 and M^4 as follows:



12.35 (a) This formula is true for n = 1: F_0 + F_2 = 0 + 1 = 1 = L_1, and for
n = 2: F_1 + F_3 = 1 + 2 = 3 = L_2. Assume the formula is true for all
integers up to and including some integer n ≥ 2. Then

L_{n+1} = L_{n-1} + L_n = (F_{n-2} + F_n) + (F_{n-1} + F_{n+1})
        = (F_{n-2} + F_{n-1}) + (F_n + F_{n+1}) = F_n + F_{n+2},

and the formula is true for all n.

(b) L_{n-1} + L_{n+1} = 5F_n, L_1 + L_2 + L_3 + ... + L_n = L_{n+2} - 3.
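The identities in Problem 12.35 are easy to spot-check numerically. The following is a minimal sketch, assuming the usual conventions F_0 = 0, F_1 = 1 and L_0 = 2, L_1 = 1; the helper names fib and lucas are illustrative only.

```python
def fib(n):
    """F_n with F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def lucas(n):
    """L_n with L_0 = 2, L_1 = 1."""
    a, b = 2, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# Check part (a) and both identities of part (b) for small n.
for n in range(1, 30):
    assert lucas(n) == fib(n - 1) + fib(n + 1)
    assert lucas(n - 1) + lucas(n + 1) == 5 * fib(n)
    assert sum(lucas(k) for k in range(1, n + 1)) == lucas(n + 2) - 3

print([lucas(n) for n in range(1, 8)])
```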

12.36 (a) By Theorem 12.2, all we have to do is solve the system of equations
2 = a + b, 1 = aα + bβ. But, since α + β = 1, we can see by inspection
that the solution is a = b = 1; therefore, the
formula is

L_n = α^n + β^n.

(b) We will first verify that 1 + α^2 = α√5 and 1 + β^2 = -β√5. This could
be checked routinely by substituting the actual values for α and β, but
it is somewhat easier to recall that αβ = -1 and α - β = √5; and so

1 + α^2 = α(-β) + α^2 = α(α - β) = α√5,

and

1 + β^2 = β(-α) + β^2 = -β(α - β) = -β√5.

Now, we can use the formula in Problem 12.35 along with Binet’s
formula to write

\[
L_n = F_{n-1} + F_{n+1}
    = \frac{\alpha^{n-1} - \beta^{n-1}}{\sqrt{5}} + \frac{\alpha^{n+1} - \beta^{n+1}}{\sqrt{5}}
    = \frac{\alpha^{n-1}(1 + \alpha^2) - \beta^{n-1}(1 + \beta^2)}{\sqrt{5}}
    = \frac{\alpha^{n-1}(\alpha\sqrt{5}) - \beta^{n-1}(-\beta\sqrt{5})}{\sqrt{5}}
    = \alpha^n + \beta^n .
\]


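12.37 Since this has already been observed for the first three n x n
determinants, all we need to do is show that these determinants follow
the same recursive pattern that the Lucas numbers follow, namely that
L_n = L_{n-2} + L_{n-1}. We will show this for the 6 x 6 determinant, but the
argument for the n x n determinant is identical.

We expand the 6 x 6 determinant about the sixth row and get

\[
\begin{vmatrix}
3 & i & 0 & 0 & 0 & 0 \\
i & 1 & i & 0 & 0 & 0 \\
0 & i & 1 & i & 0 & 0 \\
0 & 0 & i & 1 & i & 0 \\
0 & 0 & 0 & i & 1 & i \\
0 & 0 & 0 & 0 & i & 1
\end{vmatrix}
= 1 \cdot
\begin{vmatrix}
3 & i & 0 & 0 & 0 \\
i & 1 & i & 0 & 0 \\
0 & i & 1 & i & 0 \\
0 & 0 & i & 1 & i \\
0 & 0 & 0 & i & 1
\end{vmatrix}
- i \cdot
\begin{vmatrix}
3 & i & 0 & 0 & 0 \\
i & 1 & i & 0 & 0 \\
0 & i & 1 & i & 0 \\
0 & 0 & i & 1 & 0 \\
0 & 0 & 0 & i & i
\end{vmatrix}.
\]

Now, this last 5 x 5 determinant on the right we can expand about
the fifth column to get

\[
\begin{vmatrix}
3 & i & 0 & 0 & 0 \\
i & 1 & i & 0 & 0 \\
0 & i & 1 & i & 0 \\
0 & 0 & i & 1 & 0 \\
0 & 0 & 0 & i & i
\end{vmatrix}
= i \cdot
\begin{vmatrix}
3 & i & 0 & 0 \\
i & 1 & i & 0 \\
0 & i & 1 & i \\
0 & 0 & i & 1
\end{vmatrix},
\]

and so, we have

\[
\begin{vmatrix}
3 & i & 0 & 0 & 0 & 0 \\
i & 1 & i & 0 & 0 & 0 \\
0 & i & 1 & i & 0 & 0 \\
0 & 0 & i & 1 & i & 0 \\
0 & 0 & 0 & i & 1 & i \\
0 & 0 & 0 & 0 & i & 1
\end{vmatrix}
= 1 \cdot
\begin{vmatrix}
3 & i & 0 & 0 & 0 \\
i & 1 & i & 0 & 0 \\
0 & i & 1 & i & 0 \\
0 & 0 & i & 1 & i \\
0 & 0 & 0 & i & 1
\end{vmatrix}
- i^2 \cdot
\begin{vmatrix}
3 & i & 0 & 0 \\
i & 1 & i & 0 \\
0 & i & 1 & i \\
0 & 0 & i & 1
\end{vmatrix}.
\]

Therefore,

\[
\begin{vmatrix}
3 & i & 0 & 0 & 0 & 0 \\
i & 1 & i & 0 & 0 & 0 \\
0 & i & 1 & i & 0 & 0 \\
0 & 0 & i & 1 & i & 0 \\
0 & 0 & 0 & i & 1 & i \\
0 & 0 & 0 & 0 & i & 1
\end{vmatrix}
=
\begin{vmatrix}
3 & i & 0 & 0 & 0 \\
i & 1 & i & 0 & 0 \\
0 & i & 1 & i & 0 \\
0 & 0 & i & 1 & i \\
0 & 0 & 0 & i & 1
\end{vmatrix}
+
\begin{vmatrix}
3 & i & 0 & 0 \\
i & 1 & i & 0 \\
0 & i & 1 & i \\
0 & 0 & i & 1
\end{vmatrix},
\]

as desired.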


12.38 The key observation is that 34 = 2 · 17 is the Fibonacci number F_9. For
any Fibonacci number F_n, by Theorem 12.7, we have gcd(34, F_n) =
gcd(F_9, F_n) = F_{gcd(9, n)}. Therefore, the only possible values for
gcd(34, F_n) are F_1 = 1, F_3 = 2, or F_9 = 34. But if F_n is an odd number,
then the only possible value for gcd(34, F_n) is 1. Hence 17 ∤ F_n.
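The argument in Problem 12.38 can be illustrated numerically. The sketch below checks gcd(34, F_n) = F_{gcd(9, n)} for the first couple of hundred Fibonacci numbers and looks for odd Fibonacci numbers divisible by 17; the helper name fib and the search bound 200 are arbitrary choices made for this illustration.

```python
from math import gcd


def fib(n):
    """F_n with F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# gcd(34, F_n) should equal F_gcd(9, n), so only F_1, F_3 and F_9 can occur.
values = {gcd(34, fib(n)) for n in range(1, 200)}

# Odd Fibonacci numbers divisible by 17 (the argument says there are none).
odd_multiples_of_17 = [n for n in range(1, 200)
                       if fib(n) % 2 == 1 and fib(n) % 17 == 0]

print(sorted(values), odd_multiples_of_17)
```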


Chapter 13 

13.6 The encrypted message is 


0228 0949 1904. 

13.7 The name of the rival’s perfume is “Annabel Lee.” 

13.8 First, he deciphers the message 


0309 1424 0442 1714 0451 
by raising each of these five numbers to the exponent b = 445 modulo
3127, and gets


0013 1300 0104 1111 0404,

and so he knows that the name of the perfume is Annabel Lee.

Next, to confirm the signature he raises each of the five numbers in 
the second message 


0239 2471 2228 2629 0398 
to the exponent b = 445 modulo 3127, and gets 

0934 1057 2642 2371 2577, 

and he then raises each of these numbers to the 59th power modulo 
2881 (using the public key for the perfume company), which again 
yields 


0013 1300 0104 1111 0404, 
thus confirming the signature. 

13.9 Since M < n, we can, without loss of generality, assume that p | M, and
that q and M are relatively prime. Since p | M, we also have that p | C
(this is because C ≡ M^a (mod n)), and so C^b ≡ M (mod p), since both
sides are congruent to 0 modulo p.

Now, since ab ≡ 1 (mod φ(n)), we can write ab = 1 + kφ(n), for some
integer k. Also, since q and M are relatively prime, we can use Fermat's
little theorem, Theorem 5.2, to conclude that M^{q-1} ≡ 1 (mod q).
Putting this together, we get

C^b ≡ M^{ab} = M^{1+kφ(n)} = M (M^{q-1})^{k(p-1)} ≡ M (1)^{k(p-1)} = M (mod q).

Therefore, since C^b ≡ M (mod p) and C^b ≡ M (mod q), we conclude
that C^b ≡ M (mod n).
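The conclusion of Problem 13.9 can be tested exhaustively for a small modulus. The sketch below reuses the modulus 3127 and deciphering exponent 445 quoted in Problem 13.8; the factorization of 3127 and the matching enciphering exponent are computed by the code, not quoted from the text, and the variable names are illustrative.

```python
# Modulus and deciphering exponent quoted in Problem 13.8.
n = 3127
b = 445

# Assumption: n is factored here by trial division; the text does not list p and q.
p = next(d for d in range(2, n) if n % d == 0)
q = n // p
phi = (p - 1) * (q - 1)

# Enciphering exponent a with a*b = 1 (mod phi(n)).
# pow(b, -1, phi) needs Python 3.8 or later.
a = pow(b, -1, phi)

# Round trip for every message M < n, including those divisible by p or q.
all_recovered = all(pow(pow(M, a, n), b, n) == M for M in range(n))
print(p, q, a, all_recovered)
```

The check covers the messages that are not relatively prime to n, which is exactly the case the proof above handles.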


Chapter 14 

14.1 [1, 2, 3, 4, 5] = 225/157.

10 463/43 200 = [0, 4, 7, 1, 3, 5, 64].


14.2 
14.6

\[
\frac{1}{x} = \frac{1}{[q_1, q_2, \ldots, q_n]}
= 0 + \cfrac{1}{q_1 + \cfrac{1}{q_2 + \cfrac{1}{q_3 + \cdots + \cfrac{1}{q_{n-1} + \cfrac{1}{q_n}}}}}
= [0, q_1, q_2, \ldots, q_n].
\]

14.7 Using Theorem 14.2, we generate the following table:

  k     -1    0    1    2    3    4    5
  q_k              1    2    3    4    5
  a_k    0    1    1    3   10   43  225
  b_k    1    0    1    2    7   30  157

and so c_1 = 1/1, c_2 = 3/2, c_3 = 10/7, c_4 = 43/30, c_5 = 225/157.
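The table in Problem 14.7 comes from the recurrences a_k = q_k a_{k-1} + a_{k-2} and b_k = q_k b_{k-1} + b_{k-2}, started from a_{-1} = 0, a_0 = 1, b_{-1} = 1, b_0 = 0 (the first two columns of the table). A minimal Python sketch of that computation follows; the function names are hypothetical, and continued_fraction simply runs the Euclidean algorithm on a rational number.

```python
from fractions import Fraction


def continued_fraction(x):
    """Partial quotients of a rational number x, via the Euclidean algorithm."""
    x = Fraction(x)
    quotients = []
    while True:
        q = x.numerator // x.denominator
        quotients.append(q)
        x -= q
        if x == 0:
            return quotients
        x = 1 / x


def convergents(quotients):
    """Convergents a_k/b_k built from the table's starting columns."""
    a_prev, a = 0, 1  # a_{-1}, a_0
    b_prev, b = 1, 0  # b_{-1}, b_0
    result = []
    for q in quotients:
        a_prev, a = a, q * a + a_prev
        b_prev, b = b, q * b + b_prev
        result.append(Fraction(a, b))
    return result


print(continued_fraction(Fraction(225, 157)))
print(convergents([1, 2, 3, 4, 5]))
```

The same two functions can be pointed at the other continued fractions in this chapter's answers, such as [29, 2, 3, 14] or [0, 4, 7, 1, 3, 5, 64].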


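14.9 (a) This formula holds for k = 1 because

\[
\begin{pmatrix} a_1 & a_0 \\ b_1 & b_0 \end{pmatrix}
= \begin{pmatrix} q_1 & 1 \\ 1 & 0 \end{pmatrix}.
\]

Now assume that the formula holds for k - 1, where 1 < k ≤ n; that
is, assume that

\[
\begin{pmatrix} a_{k-1} & a_{k-2} \\ b_{k-1} & b_{k-2} \end{pmatrix}
= \begin{pmatrix} q_1 & 1 \\ 1 & 0 \end{pmatrix}
  \begin{pmatrix} q_2 & 1 \\ 1 & 0 \end{pmatrix}
  \cdots
  \begin{pmatrix} q_{k-1} & 1 \\ 1 & 0 \end{pmatrix}.
\]

Then, by Theorem 14.2,

\[
\begin{pmatrix} q_1 & 1 \\ 1 & 0 \end{pmatrix}
\begin{pmatrix} q_2 & 1 \\ 1 & 0 \end{pmatrix}
\cdots
\begin{pmatrix} q_k & 1 \\ 1 & 0 \end{pmatrix}
= \begin{pmatrix} a_{k-1} & a_{k-2} \\ b_{k-1} & b_{k-2} \end{pmatrix}
  \begin{pmatrix} q_k & 1 \\ 1 & 0 \end{pmatrix}
= \begin{pmatrix} q_k a_{k-1} + a_{k-2} & a_{k-1} \\ q_k b_{k-1} + b_{k-2} & b_{k-1} \end{pmatrix}
= \begin{pmatrix} a_k & a_{k-1} \\ b_k & b_{k-1} \end{pmatrix},
\]

and the formula is verified.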
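(b) For k = 1:

\[
\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}
\]

gives us c_1 = 1/1.

For k = 2:

\[
\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}
\begin{pmatrix} 116 & 1 \\ 1 & 0 \end{pmatrix}
= \begin{pmatrix} 117 & 1 \\ 116 & 1 \end{pmatrix},
\quad \text{and so } c_2 = \frac{117}{116}.
\]

For k = 3:

\[
\begin{pmatrix} 117 & 1 \\ 116 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}
= \begin{pmatrix} 118 & 117 \\ 117 & 116 \end{pmatrix},
\quad \text{and so } c_3 = \frac{118}{117}.
\]

For k = 4:

\[
\begin{pmatrix} 118 & 117 \\ 117 & 116 \end{pmatrix}
\begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix}
= \begin{pmatrix} 353 & 118 \\ 350 & 117 \end{pmatrix},
\quad \text{and so } c_4 = \frac{353}{350}.
\]

For k = 5:

\[
\begin{pmatrix} 353 & 118 \\ 350 & 117 \end{pmatrix}
\begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix}
= \begin{pmatrix} 824 & 353 \\ 817 & 350 \end{pmatrix},
\quad \text{and so } c_5 = \frac{824}{817}.
\]

For k = 6:

\[
\begin{pmatrix} 824 & 353 \\ 817 & 350 \end{pmatrix}
\begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix}
= \begin{pmatrix} 2001 & 824 \\ 1984 & 817 \end{pmatrix},
\quad \text{and so } c_6 = \frac{2001}{1984}.
\]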


14.10 In Problem 14.8, we saw that the convergents for the continued
fraction [1, 1, 1, 1, 1, 1, 1] satisfied a_k = F_{k+1} and b_k = F_k. Thus, by
Theorem 14.3,

F_{k+1} F_{k-1} - F_k^2 = a_k b_{k-1} - b_k a_{k-1} = (-1)^k.

Since this can be rewritten as F_k^2 = F_{k-1} F_{k+1} + (-1)^{k+1}, the identity
follows.

14.11 900x + 628y = 400 reduces to 225x + 157y = 100. So we first solve
225x + 157y = 1. But by Problem 14.1 we see that 225/157 = [1, 2, 3, 4, 5],
and by Problem 14.7 the convergents are 1/1, 3/2, 10/7, 43/30, 225/157. Then,
applying Theorem 14.3 to the last two convergents, we get

225(30) - 157(43) = (-1)^5 = -1,

which we rewrite as 225(-30) + 157(43) = 1, and so a solution for the
equation 225x + 157y = 1 is x = -30, y = 43; hence a solution for the
equation 225x + 157y = 100 is x = -3000, y = 4300, which is also a
solution for 900x + 628y = 400.
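The method of Problem 14.11 can be sketched as a small routine: reduce the equation by the gcd, run the convergent recurrences for A/B, and read a solution off the determinant identity a_k b_{k-1} - b_k a_{k-1} = (-1)^k. The function name solve_linear is hypothetical, and the sketch assumes A and B are positive.

```python
from math import gcd


def solve_linear(A, B, C):
    """One integer solution (x, y) of A*x + B*y = C, or None if none exists.

    Assumes A and B are positive integers.
    """
    g = gcd(A, B)
    if C % g:
        return None
    A, B, C = A // g, B // g, C // g
    # Convergent recurrences; the last convergent a/b equals A/B.
    a_prev, a, b_prev, b = 0, 1, 1, 0
    num, den, k = A, B, 0
    while den:
        q = num // den
        num, den = den, num - q * den
        a_prev, a = a, q * a + a_prev
        b_prev, b = b, q * b + b_prev
        k += 1
    # a*b_prev - b*a_prev = (-1)^k with a = A and b = B.
    sign = (-1) ** k
    x, y = sign * b_prev, -sign * a_prev
    return x * C, y * C


print(solve_linear(900, 628, 400))
```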

14.12 From Problem 14.7 the convergents are 1/1, 3/2, 10/7, 43/30, 225/157, and so

1/1 < 10/7 < 225/157 < 43/30 < 3/2, as claimed. (You can check this using a
calculator, but the only inequalities that are not obvious here are
10/7 < 225/157, which can be verified since
10 x 157 = 1570 < 1575 = 7 x 225, and 225/157 < 43/30, which can be verified
since 225 x 30 = 6750 < 6751 = 157 x 43.)
14.13 We can express 29.43 as the continued fraction [29, 2, 3, 14] and the
convergents for this continued fraction are 29/1, 59/2, 206/7, 2943/100. We can
rule out using the convergent 59/2 = 29.5 since a gear with only two
teeth is hardly practical. Similarly, with some regret, we rule out the
convergent 2943/100 = 29.43 because a gear with 2943 teeth would be about
20 feet across! Fortunately, the remaining convergent 206/7 ≈ 29.4286 is
not only an excellent approximation to 29.43, as Huygens discovered,
but it is also quite practical to construct and use.

14.14 (a) 1 day = 24 hours = 60 x 24 minutes = 60 x 60 x 24 seconds = 86 400
seconds.

5 hours + 48 minutes + 46 seconds = (5 x 60 x 60 + 48 x 60 + 46)
seconds = 20 926 seconds = 20 926/86 400 days = 10 463/43 200 days.

(b) 1600 (1/4 - 10 463/43 200) ≈ 12.5 days!

(c) From 1600 to 1999, the Julian calendar will add 100 extra days; the
Gregorian calendar adds only 97 days (since it omits February 29 in
1700, 1800, and 1900).

(d) 10 463/43 200 = [0, 4, 7, 1, 3, 5, 64], and the convergents are

0/1, 1/4, 7/29, 8/33, 31/128, 163/673, 10 463/43 200.

We could therefore add 8 days every 33 years and the error would be 
only .000 225 days per year. In fact, it would take 4441 years to be off by 
a single day! This system was proposed by Omar Khayyam and a 
commission of astronomers and was actually adopted by the sultan for 
some years, but then fell into disuse. Here is how such a calendar might 
work. Since this new calendar needs to operate on a 33-year cycle, we 
could begin this calendar in the year 2013, because 2013 is divisible by 
33. Note how nicely this works out because in the 33-year period 
2013-45 exactly 8 years are divisible by 4 (namely, 2016, 2020,
. . . , 2044), so we merely add a leap year day in each of those 8 years.
Then, in 2046, we begin a new 33-year cycle for the years 2046-78 and 
can do the same thing as before, adding a leap year day in years 
divisible by 4. The next cycle 2079-2111 also works perfectly. However, 
for the cycle 2112-44 we need to make a slight adjustment since both 
2112 and 2144 are divisible by 4; that is, 9 years in this cycle are 
divisible by 4. So we do not add a leap year day in 2112, but still add leap 
year days in 2116, 2120, . . . , 2144. This may sound complicated, but 
this calendar can easily be described as follows: 

Add February 29 to any year that is divisible by 4, unless that year 

is also divisible by 33. 
We could do considerably better by adding 31 days every 128 years
and in this case it would take 86 400 years to be off by a day. This
calendar would also be easy to describe:

Add February 29 to any year that is divisible by 4, unless that year
is also divisible by 128.

You might be thinking why not use the convergent 163/673 or even 10 463/43 200
and get even better calendars. But any improvement at this point
would be an illusion since from the beginning we only measured the
actual length of a year to the nearest second, and the convergent 31/128
already agrees with this measurement to the nearest second. So any
further improvement would require a more accurate measurement of
how long it takes the Earth to revolve around the Sun. Moreover, the
rotation of the Earth is actually slowing down, and so any calendar
would eventually need an adjustment to account for this fact.
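The two calendar rules in Problem 14.14(d), and the drift figures quoted for them, can be checked with exact rational arithmetic. This is a sketch only: the function names are hypothetical, and the year length is the 5 hours, 48 minutes, 46 seconds used in part (a).

```python
from fractions import Fraction

# Excess of the year over 365 days, as a fraction of a day (part (a)).
YEAR_FRACTION = Fraction(20926, 86400)


def is_leap_33(year):
    """Rule for the 33-year calendar: divisible by 4 but not by 33."""
    return year % 4 == 0 and year % 33 != 0


def is_leap_128(year):
    """Rule for the 128-year calendar: divisible by 4 but not by 128."""
    return year % 4 == 0 and year % 128 != 0


def years_per_day_of_drift(leap_days, cycle_years):
    """Years needed for a calendar with this leap-day rate to drift one day."""
    error = abs(Fraction(leap_days, cycle_years) - YEAR_FRACTION)
    return 1 / error


leap_days_2013_cycle = sum(is_leap_33(y) for y in range(2013, 2046))
leap_days_2112_cycle = sum(is_leap_33(y) for y in range(2112, 2145))
print(leap_days_2013_cycle, leap_days_2112_cycle)
print(float(years_per_day_of_drift(8, 33)), years_per_day_of_drift(31, 128))
```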

14.17 The desired quadratic equation is 97x^2 - 206x + 96 = 0.

14.18 (a) The rational number is 

(b) Since each a_i is positive, we
choose a_1 = a_2 = 1. Thus 2.71828 - 1 - 1 = .71828, and we choose
a_3 = 2 (so that 1/2 < .71828). So 2.71828 - 1 - 1 - 1/2 = .21828, and we
choose a_4 = 3 (so that 1/6 < .21828). Next,
2.71828 - 1 - 1 - 1/2 - 1/6 = .051613, and we choose a_5 = 4. Finally, then
2.71828 - 1 - 1 - 1/2 - 1/6 - 1/24 = .009947, and we choose a_6 = 5. Thus
the first six terms for an ascending continued fraction representation
of the number e are given by

\[
\cfrac{1 + \cfrac{1 + \cfrac{1 + \cfrac{1 + \cfrac{1 + \cfrac{1}{5}}{4}}{3}}{2}}{1}}{1}.
\]

(c) Since e^x = 1 + x + x^2/2! + ... , we get

\[
e = \frac{1}{1} + \frac{1}{1 \cdot 1} + \frac{1}{1 \cdot 1 \cdot 2} + \frac{1}{1 \cdot 1 \cdot 2 \cdot 3} + \cdots ;
\]

that is, e is represented by the ascending continued fraction

\[
e = \cfrac{1 + \cfrac{1 + \cfrac{1 + \cdots}{a_3}}{a_2}}{a_1},
\]

where a_1 = 1 and each a_i = i - 1, for i > 1.

14.23 The convergents are

2/1, 3/1, 8/3, 11/4, 19/7, 87/32, 106/39, 193/71, 1264/465, 1457/536,
2721/1001, 23 225/8544, ... .

Thus the eleventh convergent 2721/1001 ≈ 2.718282 is accurate to six
decimals because 1/(1001 x 8544) ≈ .0000001.

14.26 (b) The possible values for m are 1, 2, 3, 4. Then, for each m, here are
the values of d that work:

m = 1: d = 4, 5;

m = 2: d = 3, 4, 5, 6;

m = 3: d = 2, 3, 4, 5, 6, 7;

m = 4: d = 1, 2, 3, 4, 5, 6, 7, 8.

(c) There are ⌊√n⌋ values for m: 1, 2, ... , ⌊√n⌋. Now, for each value of
m, there are 2m positive integers in the interval between √n - m and
√n + m; hence 2m values for d.

Therefore, the total number of quadratic surds that satisfy
conditions (1) and (3) of Theorem 14.15 is

\[
2 + 4 + \cdots + 2\lfloor\sqrt{n}\rfloor
= 2\,(1 + 2 + \cdots + \lfloor\sqrt{n}\rfloor)
= 2 \cdot \frac{\lfloor\sqrt{n}\rfloor(\lfloor\sqrt{n}\rfloor + 1)}{2}
= \lfloor\sqrt{n}\rfloor(\lfloor\sqrt{n}\rfloor + 1)
= \lfloor\sqrt{n}\rfloor \cdot \lceil\sqrt{n}\rceil .
\]
14.27 First, we compute

\[
\frac{11 + \sqrt{37}}{12} = 1 + \frac{-1 + \sqrt{37}}{12}
= 1 + \cfrac{1}{\cfrac{12}{-1 + \sqrt{37}}}
= 1 + \cfrac{1}{\cfrac{1 + \sqrt{37}}{3}} .
\]

At this point, we have q_1 = 1; however, the new quadratic surd (1 + √37)/3 is
also such that its conjugate does not lie between -1 and 0. So, we
continue

\[
\frac{1 + \sqrt{37}}{3} = 2 + \frac{-5 + \sqrt{37}}{3}
= 2 + \cfrac{1}{\cfrac{3}{-5 + \sqrt{37}}}
= 2 + \cfrac{1}{\cfrac{5 + \sqrt{37}}{4}} .
\]

At this point, we have q_2 = 2, and we also see that (5 + √37)/4 > 1 and that its
conjugate lies between -1 and 0 (since 6 < √37 < 7). Therefore,
we know that the continued fraction for (5 + √37)/4 will be purely periodic.
The computation, with some detail left out, goes like this:

\[
\frac{5 + \sqrt{37}}{4}
= 2 + \cfrac{1}{\cfrac{3 + \sqrt{37}}{7}}
= 2 + \cfrac{1}{1 + \cfrac{1}{\cfrac{4 + \sqrt{37}}{3}}}
= 2 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{\cfrac{5 + \sqrt{37}}{4}}}} ,
\]

and so we have q_3 = 2, q_4 = 1, q_5 = 3; moreover, since we have reached
the same quadratic surd as we had before, these three numbers
will now repeat forever. Hence (11 + √37)/12 = [1, 2, \overline{2, 1, 3}].

14.30 (a) Here k = 6, which is even, so we compute the first six convergents,
and get

4/1, 9/2, 13/3, 48/11, 61/14, 170/39.

Therefore, the smallest positive solution is x = 170, y = 39.
(b) Computing (170 + 39√19)^k for k = 2, 3, 4 produces
k = 2: x = 57 799, y = 13 260;
k = 3: x = 19 651 490, y = 4 508 361;
k = 4: x = 6 681 448 801, y = 1 532 829 480.
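The powers in Problem 14.30(b) can be generated without expanding radicals by multiplying pairs (x, y) that stand for x + y√19. The sketch below assumes the equation is x^2 - 19y^2 = 1, which is what the fundamental solution x = 170, y = 39 satisfies; the function name pell_solutions is hypothetical.

```python
def pell_solutions(x1, y1, n, count):
    """First `count` solutions of x^2 - n*y^2 = 1 from powers of x1 + y1*sqrt(n)."""
    solutions = []
    x, y = x1, y1
    for _ in range(count):
        solutions.append((x, y))
        # (x + y*sqrt(n)) * (x1 + y1*sqrt(n)), collected by rational and sqrt(n) parts
        x, y = x * x1 + n * y * y1, x * y1 + y * x1
    return solutions


for x, y in pell_solutions(170, 39, 19, 4):
    print(x, y, x * x - 19 * y * y)
```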
Chapter 15

15.1 p(8) = 22.

15.3 P(4, 16) = 9.

15.4 P(7, 50) = 522.

15.7 For n = 8, the partitions having distinct parts are
8, 7 + 1, 6 + 2, 5 + 3, 5 + 2 + 1, 4 + 3 + 1. Applying the transformation
to the even partitions has the following effect:

7 + 1 ⇒ 8; 6 + 2 ⇒ 5 + 2 + 1; 5 + 3 ⇒ 4 + 3 + 1.

15.12 p(20) = 627. 
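The values of p(n) in Problems 15.1 and 15.12, and the list of distinct-part partitions in Problem 15.7, can be reproduced with a short program. This is an illustrative sketch with hypothetical function names; it counts partitions by allowing one more part size at a time, which is one standard way to do it and not necessarily the method of the text.

```python
def partition_count(n):
    """p(n): the number of partitions of n."""
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]


def distinct_partitions(n, max_part=None):
    """Yield the partitions of n into distinct parts, largest part first."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, max_part), 0, -1):
        for rest in distinct_partitions(n - part, part - 1):
            yield (part,) + rest


print(partition_count(8), partition_count(20))
print(list(distinct_partitions(8)))
```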


Brief Introduction to Sage 


Sage is free mathematical software that is an open-source alternative to 
Maple, Mathematica, Magma, and MATLAB. You can use it online at 
http://www.sagenb.org. 

When you go to this site you can create an account. You can then 
edit, save, and print your worksheets. Although Sage can be used in 
many areas of mathematics, in this introduction we will focus on its use 
in number theory. 

In order to do a calculation of some kind you simply enter an expression
into a box and click “evaluate.” However, in this introduction,
rather than try to replicate this box format, we will indicate what happens
with a different notation for the sake of simplicity. For example,
suppose you want to find 25!; then what you do is type factorial(25)
into the box and click “evaluate.” Sage will almost instantly print your
answer below the box: 15511210043330985984000000. Here is how we
will indicate that interaction:

sage: factorial(25)
15511210043330985984000000

You will be impressed by how easy it is to do many standard things in
number theory. If you want to calculate a binomial coefficient \binom{4}{2}, here
it is:

sage: binomial(4, 2)
6

Do you want to know if 17 divides 2012?

sage: 17.divides(2012)
False

or, you could find the remainder this way:

sage: 2012 % 17
6

Here is the gcd of 72 and 30:

sage: gcd(72, 30)
6
Sage can even use the Euclidean algorithm as we did in Chapter 3 to
write gcd(a, b) as a linear combination of a and b:

sage: d, x, y = xgcd(72, 30); d, x, y
(6, -2, 5)

which tells us that 6 = (-2) · 72 + 5 · 30.

You can test a number to see if it is square:

sage: is_square(289), is_square(10)
(True, False)

or perhaps you want to show that 2001 and 1984 are relatively prime by
using prime factorization:

sage: factor(2001), factor(1984)
(3 * 23 * 29, 2^6 * 31)

There are many things we like to do with primes in number theory:

sage: is_prime(1597)
True

sage: next_prime(1000), previous_prime(1000)
(1009, 997)

sage: random_prime(1000), random_prime(1000)
(409, 137)

sage: [p for p in prime_range(2, 23)]
[2, 3, 5, 7, 11, 13, 17, 19]

sage: nth_prime(10), nth_prime(1000)
(29, 7919)

sage: prime_divisors(1984)
[2, 31]

and here we test 2^67 - 1 to see if it is a Mersenne prime:

sage: factor(2^67 - 1)
193707721 * 761838257287

Here is the Euler phi function:

sage: euler_phi(10), euler_phi(17)
(4, 16)

In Chapter 7, the key observation for finding a formula for φ(n) was that
φ(p^k) = p^k - p^{k-1} when p is prime:

sage: euler_phi(17^8), 17^8 - 17^7
(6565418768, 6565418768)
and the general formula is φ(n) = n(1 - 1/p_1) ··· (1 - 1/p_k), where p_1, ... , p_k
are the prime divisors of n:

sage: euler_phi(1000)
400

sage: 1000*prod([1 - 1/p for p in prime_divisors(1000)])
400

Here is the sigma function:

sage: divisors(28); sum(divisors(28))
[1, 2, 4, 7, 14, 28]
56

showing that 28 is a perfect number because σ(28) = 2 · 28, or we could
use

sage: sigma(28, 1) == 2 * 28
True

Sage has a variety of ways for dealing with modular arithmetic:

sage: Mod(72, 30), Mod(2012, 17)
(12, 6)

Here is how to find the inverse of 17 modulo 23:

sage: inverse_mod(17, 23)
19

It is convenient to define a function in order to do arithmetic modulo n:

sage: mod11 = IntegerModRing(11)

then we can compute remainders

sage: mod11(60)
5

find inverses

sage: mod11(5^(-1)), mod11(1/5)
(9, 9)

test whether a number is a square

sage: is_square(mod11(14))
True

compute high powers

sage: mod11(7^4157845)
10
In Chapter 7 we defined the order of a number a modulo n to be the
least positive exponent k such that a^k is congruent to 1 modulo n; if we
define a function mod10 as we defined mod11 above, we can compute
the order of 7 modulo 11 and the order of 3 modulo 10:

sage: multiplicative_order(mod11(7))
10

sage: multiplicative_order(mod10(3))
4

For a couple of final examples of Sage in action, suppose you want
to find the first nine Fibonacci numbers, or the seventeenth Fibonacci
number, or all the partitions of 4:

sage: [fibonacci(n) for n in range(1, 10)]
[1, 1, 2, 3, 5, 8, 13, 21, 34]

sage: fibonacci(17)
1597

sage: list(partitions(4))
[(1, 1, 1, 1), (1, 1, 2), (2, 2), (1, 3), (4,)]


Suggestions for Further Reading 


As you would expect for a subject that has been studied for as long as number 
theory, there is an almost unlimited supply of excellent books and articles that 
can serve as guides for further study. Here then is an abbreviated list of a few 
resources I highly recommend. 

The one number theory book I consider indispensable was first published 
in 1938 and is based upon lectures delivered by the authors in a number 
of universities over a period of ten years. It remains to this day the classic 
introduction to number theory. 

G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers,
6th edition, Oxford University Press, 2008.

This latest edition contains a chapter by Joseph H. Silverman on modular
elliptic curves and their role in proving Fermat’s last theorem as well as an
introduction by Andrew Wiles (who finally proved Fermat’s last theorem
in 1994).

And, speaking of G. H. Hardy, I can’t resist urging you to read Hardy’s 
very personal account of why it is worthwhile to do mathematics in 
the first place and what in his opinion is the proper justification of a 
mathematician’s life. This thin volume contains a delightful foreword by 
C. P. Snow (best known for his 1959 lecture The Two Cultures in which he 
spoke of the deplorable, and seemingly unbridgeable, gulf between the 
sciences and the humanities).

G. H. Hardy, A Mathematician's Apology, Cambridge University Press, 2012. 

Another classic book on number theory is not really a textbook but was 
instead written for the general reader. 

H. Davenport, The Higher Arithmetic, 8th edition, Cambridge University Press,
2008.

The following book is another classic and has the advantage that it 
introduces concepts from abstract algebra such as groups, rings, and 
fields. It also has a nice emphasis on the history of number theory. 

W. J. LeVeque, Fundamentals of Number Theory, Dover Publications, 1996. 

Without question the single best historical exposition of number 
theory is 

A. Weil, Number Theory: An Approach Through History from Hammurapi to Legendre,
Birkhäuser, 2007.

Because this was the book that first made me fall in love with mathematics 
as an undergraduate, I also recommend the latest edition of 
I. Niven, H. S. Zuckerman, and H. L. Montgomery, An Introduction to the Theory
of Numbers, 5th edition, John Wiley & Sons, 1991.

In Chapter 2, I mentioned the Book, that imaginary volume that in the
mind of Paul Erdős would contain all of the perfect proofs of mathematical
theorems. Fortunately, there is now a tangible approximation to the
Book, which contains an entire section on number theory (including six
proofs of the infinity of primes) and which benefited greatly from many
pages of suggestions by Paul himself.

M. Aigner and G. M. Ziegler, Proofs from THE BOOK, 4th edition, Springer, 2012. 

There are also many books that deal with specific topics in number theory. 
One of the very best of these is a fascinating and meticulously researched 
book on Pascal’s triangle. 

A. W. F. Edwards, Pascal's Arithmetical Triangle: The Story of a Mathematical Idea,
Johns Hopkins University Press, 2002.

Not surprisingly, one topic that has received much attention is the 350- 
year quest to prove Fermat’s last theorem. The first book listed below 
is based on an award-winning documentary film, while the other two 
books are considerably more advanced and use Fermat’s last theorem as 
motivation to study algebraic number theory. 

S. Singh, Fermat’s Enigma: The Epic Quest to Solve the World’s Greatest Mathematical
Problem, Anchor, 1998.

I. Stewart and D. Tall, Algebraic Number Theory and Fermat’s Last Theorem,
3rd edition, A K Peters/CRC Press, 2001.

H. M. Edwards, Fermat’s Last Theorem: A Genetic Introduction to Algebraic Number
Theory, Springer, 2000.

Just before his death in 1996, Paul Erdős was able to see and enjoy the
first copies of two volumes that illustrate the scope of his mathematical
work. The first of these volumes, listed below, contains a moving tribute
and overview of Paul’s remarkable life and work by his close friend Béla
Bollobás, an article by Paul himself: Some of My Favorite Problems and
Results (every lecture I ever heard Paul give had essentially the same title
and this particular article captures perfectly the way in which Paul would
stride effortlessly from one topic to another), and a substantial section on
his work in number theory (of his first sixty papers all but two were related
to number theory).

R. L. Graham and Jaroslav Nešetřil (eds.), The Mathematics of Paul Erdős I,
Springer, 1996 (Algorithms and Combinatorics, vol. 13).

Among the books that deal with the mathematics of Ramanujan perhaps 
the most accessible is one that focuses on his work in number theory. 

B. C. Berndt, Number Theory in the Spirit of Ramanujan, American Mathematical
Society, 2006.

There are also many books that are somewhat more biographical in
nature. A good example is the following about Fibonacci and his work.

K. Devlin, The Man of Numbers: Fibonacci’s Arithmetic Revolution, Walker and
Company, 2011.

Two beautifully written books tell the compelling story of Ramanujan, 
the first a remarkably fine biography and the second a novel that is a 
masterpiece of historical fiction. 

R. Kanigel, The Man Who Knew Infinity: A Life of the Genius Ramanujan,
Washington Square Press, 1992.

D. Leavitt, The Indian Clerk, Bloomsbury, 2008.

Each of two wonderful biographies of Paul Erdős has its own charm:

P. Hoffman, The Man Who Loved Only Numbers: The Story of Paul Erdős and the
Search for Mathematical Truth, Hyperion, 1999.

B. Schechter, MY BRAIN IS OPEN: The Mathematical Journeys of Paul Erdős, Simon
& Schuster, 2000.

Turning now to articles on number theory (and again, there are more
than one could possibly count), many of these would provide excellent
material for individual projects. I will begin by mentioning a marvelous
collection of forty articles on number theory, many of which have
been awarded prizes and many of whose authors have won distinguished
teaching awards. It is also a joy to find included in this book articles by
Leonhard Euler (translated by George Pólya), Paul Erdős, Ivan Niven, Carl
Pomerance, and Martin Gardner, as well as an article on the Riemann zeta
function written especially for this book.

A. T. Benjamin and E. Brown (eds.), Biscuits of Number Theory, The Mathematical 
Association of America, 2009. 

An excellent article to compare with Neugebauer's interpretation of 
Plimpton 322 as presented in Chapter 1 is the following. 

E. Robson, Words and pictures: New light on Plimpton 322, American Mathematical
Monthly 109(2) (February 2002), 105-20.

A fascinating article involving Legendre illustrates the degree to which 
historical inaccuracies can be repeated over and over again and also 
explains why no portrait of Legendre was presented in Chapter 8. 

P. Duran, Changing faces: The mistaken portrait of Legendre, Notices of the 
American Mathematical Society 56(11) (December 2009), 1440-43. 

Finally, I will now simply list several articles I feel would be interesting
and easily accessible for readers of this book. These have been chosen
almost randomly from five journals (The American Mathematical Monthly,
Mathematics Magazine, The College Mathematics Journal, The Mathematical
Intelligencer, and Notices of the American Mathematical Society) in order to
suggest that finding interesting and suitable additional material about
number theory is as easy as going to a library (or online) and browsing
through back copies of these particular journals.

F. Saidak, A new proof of Euclid’s theorem, American Mathematical Monthly
113(10) (December 2006), 937-38.

J. Brillhart, A note on Euler’s factoring problem, American Mathematical Monthly
116(10) (December 2009), 928-31.
A. Granville, Prime number patterns, American Mathematical Monthly 115(4) 
(April 2008), 279-96. 

D. Pengelley and F. Richman, Did Euclid need the Euclidean algorithm to 
prove unique factorization? American Mathematical Monthly 113(3) (March 
2006), 196-205. 

D. Angell, Evaluation of certain Legendre symbols, American Mathematical 
Monthly 119(8) (October 2012), 690-93. 

T. J. Pickett and A. Coleman, Another continued fraction for π, American
Mathematical Monthly 115(10) (December 2008), 930-33.

I. D. Mercer, On Furstenberg’s proof of the infinitude of primes, American
Mathematical Monthly 116(4) (April 2009), 355-56.

H. Cohn, A short proof of the simple continued fraction expansion of e,
American Mathematical Monthly 113(1) (January 2006), 57-62.

S. Northshield, A short proof and generalization of Lagrange’s theorem on con- 
tinued fractions, American Mathematical Monthly 118(2) (February 2011), 
171-75. 

S. H. Weintraub, On Legendre's work on the law of quadratic reciprocity, 
American Mathematical Monthly 118(3) (March 2011), 210-16. 

N. D. Baruah, Ramanujan’s series for 1/π, American Mathematical Monthly 116(7)
(August-September 2009), 567-87.

J. Sondow, Ramanujan primes and Bertrand’s postulate, American Mathematical
Monthly 116(7) (August-September 2009), 630-35.

G. E. Andrews, Euler's pentagonal number theorem, Mathematics Magazine 
56(5) (November 1983), 279-84. 

I. Kleiner, Fermat: The founder of modern number theory, Mathematics
Magazine 78(1) (February 2005), 3-14.

M. Chamberland, Ramanujan’s 6-8-10 equation and beyond, Mathematics
Magazine 82(2) (April 2009), 135-40.

K. A. Ross, Repeating decimals: A period piece, Mathematics Magazine 83(1)
(February 2010), 33-45.

P. Laosinchai, Proof without words: Sums of cubes, Mathematics Magazine 85(5) 
(December 2012), 360. 

R. Blecksmith and S. Brudno, Equal sums of three fourth powers or what 
Ramanujan could have said, Mathematics Magazine 79(4) (October 2006), 
299-301. 

J. H. Han and M. D. Hirschhorn, Another look at an amazing identity of 

Ramanujan, Mathematics Magazine 79(4) (October 2006), 302-4. 

C. Cooper and S. S. So (eds.), Problems and solutions: Even perfect numbers and 
Pythagorean triples, College Mathematics Journal 39(5) (November 2008), 
403-4. 

R. T. Harger and W. L. Hightower, An interesting property of x/π(x), College
Mathematics Journal 40(3) (May 2009), 213-14.

R. C. Laubenbacher and D. J. Pengelley, Gauss, Eisenstein, and the “third” proof
of the quadratic reciprocity theorem: Ein kleines Schauspiel, Mathematical
Intelligencer 16(2) (Spring 1994), 67-72.
H. L. Montgomery and S. Wagon, A heuristic for the prime number theorem,
Mathematical Intelligencer 28(3) (Summer 2006), 6-9. 

S. Goldstine, Dancing elves and a flower’s view of Euclid’s algorithm,
Mathematical Intelligencer 28(4) (Fall 2006), 23-30.

M. Hardy and C. Woodgold, Prime simplicity, Mathematical Intelligencer 31(4) 
(Fall 2009), 44-52. 

K. Ono, The last words of a genius, Notices of the American Mathematical Society 
57(11) (December 2010), 1410-19. 
Lamé, Gabriel, 316

norm, 42

numbers: abundant, 281; complex, 41,
52-53, 88, 150, 181; composite, 23;
deficient, 281; Euler pentagonal, 440;
Fermat, 137-39, 227; Fibonacci, 324-49;
harmonic, 302; hexagonal, 50;
Hindu-Arabic, 325, 364; integers, 2;
irrational, 26, 34, 38; Lucas, 360-61;
Mersenne, 272-73; natural, ix, 2;
pentagonal, 50, 440; perfect, 132-34,
273-74; prime, 4-5, 22-23, 258-76,
285-99; pyramidal, 32-33; rational, 23;
regular, 6-7; relatively prime, 9-10;
sexagesimal, 4-6; square, 7-8;
squarefree, 82; tetrahedral, 29-33;
trapezoidal, 50-51; triangular, 27-33

one-time pad encryption system, 368-69 
order, 197 

Pacioli, Luca, 153, 329 
pairwise relatively prime, 20 
partial fractions, 348 
partial quotients, 77, 384, 395 
partitions, 433-52; ordered, 458 
Pascal, Blaise, 21 
Pascal’s triangle, 140-43, 339 
Pell, John, 102 

Pell’s equation, 101-2, 107-9, 380,
409-24

pentagonal numbers, 50, 440 

perfect numbers, 132-34, 273-74; odd, 274 

φ(n), 190

Pingala, 324, 338 

π(x), 251

Platonic solids, the five, 225-26 

Plimpton 322, 3-4, 9-10

Plutarch, 29 

P(m,n), 435 

p(n), 433 

p_n, 50

P_n, 32

Poe, Edgar Allan, 366 
Polignac, Alphonse de, 306 
polyalphabetic cypher, 367 
Pomerance, Carl, 263, 274 
power series, 347 
prime decomposition. See prime 
factorization 
prime element, 69 
prime factorization, 39, 43 
prime numbers, 4-5, 22-23, 258-76, 
285-99 

prime number theorem, the, 227 
prime testing, 267-68; probabilistic, 
271-72 

prime triplets, 287 

primitive Pythagorean triple, 2-3, 9-12 
primitive roots, 178, 195-99 
Proof, 309, 320 

proof by contradiction, 37-38 
proper divisor, 132 

pseudoprimes, 269-70; absolute, 270-71
public key encryption systems, 259,
370-74




purely periodic continued fraction, 399 
pyramidal numbers, 32-33 
Pythagoras, 26-29 
Pythagorean triangles, 1-2 
Pythagorean triple, 2; primitive, 2-16 
Pythagorean tuning, 44-46 

quadratic congruences, 167, 176-79 
quadratic nonresidues, 200-202, 230 
quadratic reciprocity, the law of, 228-30, 
239-50 

quadratic residues, 200-202, 229-30 
quadratic sieve method, the, 261-67 
quadratic surds, 399 

Ramanujan, Srinivasa, 450-57 
rational numbers, 23 
recurrence relations, 324, 326; first order, 
352; second order, 352 
reduced system of residues, 191 
reductio ad absurdum, 37 
regular numbers, 6-7 
relatively prime, 9-10 
repeating continued fraction, 80, 380 
Rhind Papyrus, the, 350-51 
Ribet, Ken, 317 
Rogers, L. J., 455 

Rogers-Ramanujan identities, the, 455 
RSA encryption systems, 259-61, 364, 
369-74 

Saunderson, Nicholas, 86 
Saunderson’s algorithm, 86-87 
secret codes, 365-66 
sexagesimal numbers, 4-6
sieve of Eratosthenes, the, 251 
σ(n), 219

simple continued fractions, 382 

Sophie Germain’s theorem, 313-16 

squarefree, 82 

square numbers, 7-8 

Star of David property, the, 160 


strong induction, 40 
sums of four squares, 199 
sums of three squares, 207 
sums of two squares, 126-32 
SunZi, 171-74 
Sylvester, J. J., 446, 448 

Taniyama-Shimura conjecture, the, 317 
τ(n), 83

Tao, Terence, 298
tetrahedral numbers, 29-33 
Thabit ibn Qurra, 137, 281 
Tiling (and the Fibonacci numbers), 
337-43 
T_n, 30

t_n (tilings), 337
t_n (triangular numbers), 27
transitive property, the, 81 
trapezoidal numbers, 50-51 
triangular numbers, 27-33 
twin prime conjecture, the, 286-87 
twin primes, 286-87 

unique factorization, 41-43 
unit fractions, 350-52 

de Vigenère, Blaise, 367
Vigenère cipher, the, 366-70

Wallis, John, 380 
Waring, Edward, 175, 207 
Waring’s problem, 207-9 
well-ordering principle, the, 41, 64 
Wigner, Eugene, xi 
Wiles, Andrew, 317 
Wilson's theorem, 174-76 

Yang Hui, 140 

Z, 1 

Zhu Shijie, 139-40 
Zu Chongzhi, 83 





MATHEMATICS 


The natural numbers have been studied for thousands of years, yet most
undergraduate textbooks present number theory as a long list of theorems
with little mention of how these results were discovered or why they are
important. This book emphasizes the historical development of number
theory, describing methods, theorems, and proofs in the contexts in which
they originated, and providing an accessible introduction to one of the
most fascinating subjects in mathematics.

Written in an informal style by an award-winning teacher, Number Theory
covers prime numbers, Fibonacci numbers, and a host of other essential
topics in number theory, while also telling the stories of the great
mathematicians behind these developments, including Euclid, Carl Friedrich
Gauss, and Sophie Germain. This one-of-a-kind introductory textbook
features an extensive set of problems that enable students to actively
reinforce and extend their understanding of the material, as well as fully
worked solutions for many of these problems. It also includes helpful hints
for when students are unsure of how to get started on a given problem.

• Uses a unique historical approach to 
teaching number theory 

• Features numerous problems, helpful 
hints, and fully worked solutions 

• Discusses fun topics like Pythagorean 
tuning in music, Sudoku puzzles, and 
arithmetic progressions of primes 

• Includes an introduction to Sage, an 
easy-to-learn yet powerful open-source 
mathematics software package 

• Ideal for undergraduate mathematics 
majors as well as non-math majors 

• Digital solutions manual (available only
to professors at http://press.princeton.edu/titles/10165.html)